PROBLEMS IN ESTIMATING FEDERAL
GOVERNMENT EXPENDITURES 


M. Cohn
United States Bureau of the Budget* 


This paper discusses a number of problems in estimating Federal 
expenditures and a number of problems in interpreting such estimates 
as are publicly available. For long-range projections, the paper argues 
that a program-by-program analysis of budgetary issues and trends is 
superior to mechanical projection of expenditures as a function of gross 
national product. For a one-year projection, it is pointed out that the 
Federal budget should be just a starting point for the forecaster because 
the expenditure estimates in the official annual budget document can- 
not be viewed as forecasts. The reasons for this include: the long lead- 
time required in the preparation of the budget; the dependence of 
some program results on economic and weather conditions that can 
change rapidly; and the fact that by its very nature the budget neces- 
sarily reflects Presidential policy recommendations rather than a de- 
tached forecast of the final results of legislative and executive branch 
actions. For quarterly and monthly forecasts, a program-by-program 
method of appraising expenditure trends is also advised. 


1. INTRODUCTION


There are sometimes wide variations between successive estimates of Federal
expenditures covering the same period, to say nothing of the differences
between advance estimates and eventual results. Some of the variations
can be explained by the problems inherent in the nature of the budget and 
governmental processes. This paper discusses a number of problems in esti- 
mating Federal expenditures and a number of problems in interpreting such 
estimates as are publicly available. It addresses itself to three types of esti- 
mates: the long-range projection, annual estimates, and forecasts for parts of 
a year (such as future calendar quarters). 

Further, this paper addresses itself to the concept of Federal spending known 
as “budget expenditures,” although the comments made are also applicable 
to estimates or forecasts of total Federal payments to the public (i.e., consolidated
cash outlays) and of Federal expenditures on income and product
account. The methods of getting from one of these concepts to another have 


* This article consists largely of a paper delivered at the annual meetings of the American Statistical Association 
in Chicago, Illinois, on December 29, 1958. The writer acknowledges with gratitude the help and suggestions of his
colleagues; the aid given by Wilfred Lewis, Jr., Naomi R. Sweeney, and Robert L. Hubbell deserves special
mention. The views expressed, however, are the personal views of the author and are not necessarily the same
as those of his colleagues or the Government agency with which he is associated. 
been described elsewhere, and frequently present some thorny problems in
addition to the ones discussed here.¹

The experience of the calendar year 1958 highlighted for many economists 
and statisticians the wide swings that can take place in estimates of Govern- 
ment spending for a future period. In January 1958, the President sent the 
fiscal year 1959 budget to the Congress, estimating expenditures of $73.9 bil- 
lion for the year ending June 30, 1959. By September 1958, this estimate was 
revised upward by $5.3 billion to $79.2 billion, a change of 7 percent in 9 
months. It was again revised upward, to $80.9 billion, in the President’s 1960 
budget transmitted to the Congress in January 1959. When the final tally was 
taken in July 1959, the total of Federal budget expenditures for the year July 1,
1958, through June 30, 1959, was $80.7 billion. 
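
The successive revisions just described reduce to simple arithmetic. The following Python sketch is an illustration added for clarity, not part of any official estimating procedure; the function name `revision_table` is hypothetical. It tabulates each estimate of fiscal 1959 expenditures quoted above against the first estimate and against the final result.

```python
# Successive official estimates of fiscal 1959 budget expenditures
# (billions of dollars), as quoted in the text above.
ESTIMATES = [
    ("January 1958 budget", 73.9),
    ("September 1958 revision", 79.2),
    ("January 1959 budget", 80.9),
]
ACTUAL = 80.7  # final tally reported in July 1959


def revision_table(estimates, actual):
    """Return rows of (label, estimate, percent change from first estimate, error vs. actual)."""
    first = estimates[0][1]
    rows = []
    for label, value in estimates:
        pct_change = round(100.0 * (value - first) / first, 1)
        error = round(value - actual, 1)
        rows.append((label, value, pct_change, error))
    return rows


for label, value, pct, err in revision_table(ESTIMATES, ACTUAL):
    print(f"{label:<26} {value:5.1f}  {pct:+5.1f}%  {err:+5.1f}")
```

Read this way, the January 1958 figure fell short of the eventual result by $6.8 billion, while the estimate made a year later came within $0.2 billion.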

However, before examining some of the problems of an estimate for a year 
which starts 6 months or so after the date of the estimate, let us look at more 
ambitious undertakings: 


2. LONG-RANGE PROJECTIONS


Recognizing the usefulness of economic and budget forecasts in their work, 
many economists employed by defense contractors have recently undertaken 
to help formulate business budgets by preparing projections for 5, 10, and 15 
years ahead. It has become fashionable among some of these forecasters to 
use roughly this process: First, they make a full-employment projection of 
GNP. Then they project Federal Government expenditures as some magic 
proportion of GNP on the basis of recent trends. Next, they project defense 
spending as some other magic proportion of total Federal purchases. Finally,
if they are optimistic, they project their own contracts as a rising percentage of
defense spending.²

This process may satisfy the contractor’s urge for an orderly guide to his 
own planning problems, and the objective of informed business planning should
be heartily endorsed. But a good deal more than regression line extrapolations 
and constant proportions are required if much faith and trust are to be placed 
in the projections. 
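
One way to see why the chain of fixed proportions deserves limited trust is to write it out. The sketch below is illustrative only: the GNP level, the three ratios, and the 10 percent error band are hypothetical assumptions chosen for the example, not figures from this paper. It suggests how modest uncertainty in each link compounds into a wide range for the final contract projection.

```python
def chain_projection(gnp, federal_share, defense_share, contractor_share):
    """Mechanical projection: each link is a fixed proportion of the one before."""
    federal = gnp * federal_share
    defense = federal * defense_share
    contracts = defense * contractor_share
    return federal, defense, contracts


def compounded_range(gnp, shares, rel_error):
    """Low and high contract projections if every share is off by -/+ rel_error."""
    low = chain_projection(gnp, *[s * (1 - rel_error) for s in shares])[2]
    high = chain_projection(gnp, *[s * (1 + rel_error) for s in shares])[2]
    return low, high


# Hypothetical inputs (assumptions for illustration only).
GNP = 600.0                   # projected full-employment GNP, $ billions
SHARES = (0.17, 0.55, 0.02)   # federal/GNP, defense/federal, contractor/defense

central = chain_projection(GNP, *SHARES)[2]
low, high = compounded_range(GNP, SHARES, rel_error=0.10)
print(f"central {central:.2f}, range {low:.2f} to {high:.2f} ($ billions)")
print(f"spread relative to central: {100 * (high - low) / central:.0f}%")
```

With three links, a 10 percent error in each ratio moves the end result by about 27 percent downward or 33 percent upward, before any error in the GNP projection itself is counted.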


1 See Colm, Gerhard, and Young, Marilyn, “The Government Sector,” Problems in the International Compari- 
son of Economic Accounts, (Volume 20, Studies in Income and Wealth), National Bureau of Economic Research, 
Princeton: Princeton University Press, 1957. See also Economic Report of the President, transmitted to the Congress,
January 20, 1959, Washington: United States Government Printing Office, 1959, Appendix table D-54, p. 201. 

“Payments to the public” represent the more comprehensive concept, since they include outlays of trust funds 
and Government-sponsored enterprises. They exclude accruals and intragovernmental expenditures which are not 
paid in cash. Federal expenditures on national income and product account also include trust fund outlays, but 
exclude purchases of land or existing assets, financial transactions such as loans, and, with respect to timing of 
purchases of goods and services, reflect production or deliveries rather than checks issued or checks paid.

A related family of problems for the economic forecaster is the inadequacy of any of the existing statistics of
Government expenditures as a measure of the impact of Government activity on the economy. There are many 
examples, but the insurance and guaranty programs will serve to illustrate the point. The economic impact of 
direct Federal housing loans and of federally guaranteed private housing loans may be very similar, yet a direct 
loan is fully reflected while the guaranty program affects budget expenditures and payments to the public only to 
the extent of administrative costs. Federal expenditures on income and product account exclude even the loan 
transactions (implying a negligible economic impact). At a later time, the expenditure of some or all of the loan 
funds is included in another sector of the economy when the borrowed money is spent; e.g., consumer expenditure 
or business investment. 

2 It should be emphasized that there have been more refined and more satisfactory approaches, such as the 
study by Kast and Rosenzweig of the University of Washington, Gross National Product and Military Spending, 
1968-1975, Seattle, Washington: Boeing Airplane Co., September 1958. 
The most realistic method for estimating Federal Government expenditures
should be one which develops logical and systematic approaches for as many 
of the segments of the budget as possible. One step could be the use of correlations
with operating and cost data and with general economic statistics, but
these relationships should then be tempered by taking advantage of the judg- 
ments and evaluations of informed experts on various phases of Federal activi- 
ties with respect to probable biases, the effects of possible economic develop- 
ments, and emerging changes in policies. 

In other words, forecasting Government spending over a long term should 
not rely exclusively on purely mechanical determination. The budget can be 
broken into significant sectors, and Government programs scrutinized indi- 
vidually in terms of pertinent factors—population trends, changing needs, 
changes in prevailing notions of Federal responsibility, and some assessment 
of political palatability of possible alternatives.³

Long-range projections which are done carefully, and which are checked 
against the available knowledge about individual Federal programs, can be 
good indicators of general trends over time. However, the fit for any one future 
year is likely to be poor. Outlays for defense and international purposes, for 
example, have been stimulated by a number of unforeseen shocks and have 
exhibited an erratic step-like upward growth in the post-World War II period
in response to Soviet challenges. First there was the Marshall Plan; then came 
the Truman Doctrine; then Korea, and a stimulus to greatly enlarged defense 
spending, from $13 billion in 1950 to $46 billion in 1952. Defense spending de- 
clined for a few years after the end of the Korean War. It was tending toward 
relative stability until the advent of Sputnik, when another sharp upward 
movement began, and when a new Government agency for space research and 
exploration was established. This step-like movement may well be the pattern 
over the long-run future, as long as the cold war remains with us, but there 
need not be upward steps every year. 

Although a program-by-program analysis is the best way to make reasonable
forecasts of trends, such an exercise contains a built-in bias that none of us 
can correct. How many of us just 20 years ago regarded missiles and rockets— 
unmanned aircraft—as more than comic strip fantasy? Who of us had any 
comprehension of atomic energy and the influence it would have on the budget? 
With our limited perspective we tend to project only problems and programs 
that we either have now, or can reasonably imagine, and hence most of us 
are bound to misrepresent or under-represent the great changes that probably 
will take place in future budgets. 

This kind of misrepresentation or oversight, however, is less likely to take 
place in a projection which analyzes trends in individual programs than in a 
mechanical extrapolation, whether or not it is based on GNP. The great glory 
of the mechanical projector is that he can find the revenues for a few pet proj- 
ects not too far off in the future without any increase in tax rates. A growing 


3 An excellent example of this technique which has been published recently is Eckstein, Otto, Trends in Public
Expenditures in the Next Decade, (Supplementary Paper) New York: Committee for Economic Development, 1959.
This subject also received very searching discussion at the National Bureau-Universities Committee’s Conference 
on Public Finances held at Charlottesville, Virginia, April 10-11, 1959, proceedings of which are to be published 
shortly. 
economy gives this hope to the military contractor, to the proponents of school
aid, of urban renewal, of river basin development, of health research and hospi- 
tal construction, et cetera, et cetera, and even to those who think the greatest 
need is for tax reduction. But when all the dreams are added together, is the 
total still the total that is projected mechanically? This is the question that 
cannot be answered without program analysis and well-weighed judgments, 
and without evaluation of the forces that might be strong enough to make in- 
creases in tax rates desirable (or at least acceptable), or decreases in tax rates 
unavoidable. 

It may turn out that a carefully prepared long-range projection would indi- 
cate future levels of total expenditures that are a relatively stable proportion 
of a growing GNP. Relative stability in this proportion, however, could mean a 
variation of 1 percent of a GNP that is approaching $500 billion annually. Such 
a finding would be a convincing demonstration that Federal spending has a 
tendency to grow with GNP, but a $5 billion variation is still a large discrepancy,
particularly if one used it as a basis for counting on new defense contracts
or a reduction in taxes or any other specific activity.


3. ANNUAL ESTIMATES


Economists, statisticians, and financiers find annual estimates of Federal 
expenditures most useful when they represent forecasts. And forecasts for a 
given year in the future certainly vary in their reliability with the interval of 
time between the date of estimate and the year to which the estimate applies; 
that is, the closer the period, the better the forecast. 

The basic source for most estimates or forecasts of Federal spending for the
coming year is the annual budget of the President. Estimates for the year be- 
ginning the following July 1 are set forth in summary and in great detail every 
January in the budget. But do these estimates really qualify as forecasts? 

Actually, the budget is a Presidential planning statement rather than a fore- 
cast. It reflects the policies and programs that the President judges to be best 
for the country. It does not attempt to forecast what the Congress will do, but 
what the President thinks the Federal Government as a whole should do. 

To make the budget a complete and comprehensive financial plan, the budget 
process requires working out a huge mass of detail. Budget expenditures are 
made from about 850 appropriation and fund accounts. Spending in any one 
year is affected by contracts made in prior years as well as the budget year. 
For each account, therefore, the budget document shows not only the esti- 
mated expenditures, but also the balances brought forward that will be avail- 
able for expenditure, the amounts expiring, and the estimated new authority 
required for the budget year. 

The picture is further complicated by other financial details. Obligations are 
required to be shown for each account for the objects to be purchased—per- 
sonal services, travel, communications, supplies, equipment, etc. Detail is also 
set forth on the activities that will be conducted under each appropriation—
administration, research, benefits, resource development, etc.—and both obliga- 
tions and expenditures are related, where possible, to anticipated workload 
data. 
Smithies described some of this detail in his book on the United States budget
system, and in his opinion there seemed to be too much detail for the budget
to serve as well as it should as the President’s plan, to say nothing of the reliability
of the estimates as forecasts.⁴

It is worth noting that a number
of rather useful statistical tools have been developed relating the costs of some 
Federal programs to workload and economic variables. For example, postal 
volume correlates closely with population and real national income; sales of 
savings bonds, up to a few years ago, at least, with wages and salaries. Veterans’ 
pensions can be related to a projection of the veteran population by age groups. 
Compensation to injured Federal employees or injured longshoremen is tied
to assumptions on employment and accident rates. 
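
A workload-based estimate of the kind described above multiplies a projected caseload by an average payment rate, group by group. The sketch below is a minimal illustration of that arithmetic; the class name, the age groups, the caseloads, and the annual rates are hypothetical placeholders, not actual veterans' pension data.

```python
from dataclasses import dataclass


@dataclass
class WorkloadGroup:
    name: str
    caseload: float      # projected number of beneficiaries
    annual_rate: float   # average annual payment per beneficiary, dollars


def workload_estimate(groups):
    """Total outlays in dollars: sum of caseload times rate over all groups."""
    return sum(g.caseload * g.annual_rate for g in groups)


# Hypothetical figures for illustration only.
groups = [
    WorkloadGroup("under 55", 100_000, 800.0),
    WorkloadGroup("55 to 64", 250_000, 900.0),
    WorkloadGroup("65 and over", 400_000, 1_000.0),
]

total = workload_estimate(groups)
print(f"Estimated outlays: ${total / 1e6:,.0f} million")
```

The calculation is exact once the caseload and rate assumptions are fixed; whatever forecasting risk remains lies in those assumptions.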

Workload budgeting has these aspects: discovery of factors which control 
volume and rates of payment, time studies to relate staff to volume, and good 
cost accounting. But, while statistical techniques add precision to the esti- 
mates, given the economic assumptions, forecasting is a somewhat different 
matter. This is brought home dramatically by considering that nowhere in 
Government budgeting are better correlation and other statistical tools avail- 
able than in the Department of Agriculture, yet no budget expenditures are 
as difficult to forecast as the outlays of the Commodity Credit Corporation 
for farm price supports. 

In contrast, looking at the public works field, expenditure estimates for con- 
struction can be made with fair accuracy, since they are in large part deter- 
mined well in advance by work begun and already under way. Even congres- 
sional changes in the budgeted amounts of new authority affect spending only 
after a considerable lag. The contrast can be grasped quickly from Table 722 by
observing the differences between the “Original budget” and “Actuals” columns,
particularly for agriculture and natural resources. The three estimates
shown for each fiscal year represent respectively: (1) the original budget pub- 
lished in January, six months before the start of each indicated year; (2) the 
review published shortly after Congress adjourns, usually sometime between 
early August and mid-October; and (3) the budget for the next year, published 
in January shortly after the midpoint of each indicated year. 

Budget detail is built up from estimates prepared initially in the operating 
units of the many agencies. During the process, the estimates are reviewed 
many times. The agency executives may desire a different balance among the 
programs of an agency than is first obtained, frequently requiring a reworking 
of the detail and sometimes a different agency total. In the Executive Office of 
the President, the estimates for each agency are again reviewed—for balance 
among agencies and for consistency with the President’s overall program, in- 
cluding fiscal objectives. All during the process, new policy decisions are being 
made, and old ones changed. Many details of the budget process and the character
of many of the decisions have been described in Burkhead’s recent volume
on Government budgeting.⁵


4 Smithies, Arthur, The Budgetary Process in the United States, New York: McGraw-Hill Book Company, Inc., 
1955, particularly pp. 101-14 and pp. 229-77. 

5 Burkhead, Jesse, Government Budgeting, New York: John Wiley & Sons, 1956, especially chapters 10, 11,
and 12. 
TABLE 722. BUDGET EXPENDITURES BY FUNCTION

Comparison of Actuals with Estimates, 1956-1958
(Fiscal years. In billions of dollars)

                             |                1956                 |                1957                 |                1958
                             |          Estimates                  |          Estimates                  |          Estimates
Function                     | Original Midyear Subsequent Actuals | Original Midyear Subsequent Actuals | Original Midyear Subsequent Actuals
                             |   budget  review     budget         |   budget  review     budget         |   budget  review     budget
Major national security      |     41.8    40.9       41.5    42.5 |     42.5    43.0       43.3    45.2 |     45.8    45.6       46.3    46.4
  and international affairs
  and finance
Veterans’ services and       |      4.6     4.8        4.8     4.8 |      4.9     4.8        4.9     4.8 |      5.0     5.0        5.0     5.0
  benefits
Labor and welfare            |      2.7     2.8        2.8     2.8 |      3.0     3.0        3.0     3.0 |      3.5     3.4        3.4     3.4
Agriculture and agricultural |      2.3     3.4        3.4     4.9 |      3.4     5.7        4.7     4.6 |      5.0     5.0        4.9     4.4
  resources
Natural resources            |                                     |                                     |
Commerce and housing*        |                                     |                                     |
General government           |                                     |                                     |
Interest                     |      6.4     6.8        6.9     6.8 |      7.1                            |
Allowance for contingencies  |                                     |                                     |
Total                        |     62.4    63.8       64.3    66.5 |     65.1    69.1       68.9    69.4 |     71.8    72.0       72.8

* For 1957 and 1958 excludes Federal-aid highway expenditures which, starting in the fiscal year 1957, are
made through a separate trust fund.
NOTES: (1) Totals are those which appear in document indicated; however, originally published figures for
the Labor and welfare, Natural resources, and Commerce and housing functions have been
roughly adjusted for comparability with the functional classification used in the 1959 Budget.
(2) Detail may not add to totals due to rounding.


It cannot be overemphasized that the budget process requires a long lead-
time to obtain all the detail presented and to allow sufficient time for policy
evaluations. Intensive work starts each spring on the budget for the fiscal year
which will end two and one-quarter years later. Policy discussion and planning
begun in the spring turn into numerical estimates in the fall, and are usually
given the printer in November and December for printing and transmittal
to the Congress in January. By the time the budget is transmitted, parts of
it might be out of date in that a few months’ actual data have become available
that were not used in making the estimates.

Once the budget is sent to the Congress, six months before the start of the
fiscal year, it is subject to further change. The Congress may or may not ac-
cept its recommendations, and the President may amend his own recommenda-
tions because of later circumstances.

And of course a number of basic estimating assumptions, including economic
activity and even the weather, can turn out to be different. Changes in the
economic assumptions—prices, incomes, employment—affect revenues and
trust fund expenditures (such as unemployment benefits) relatively more than
budget expenditures, but all are affected. Interest outlays respond rather
promptly to changes in interest rates for that portion of the annual borrowing
which is short-term and must be refunded often. Patients receiving veterans’
medical care and the number of clients on public assistance rolls both increase,
under existing laws, when employment in the nation declines and, of course,
all programs are affected by price changes. Estimated outlays for the agricultural
price support programs of the Commodity Credit Corporation in 1959
were raised by 70 percent—$1.6 billion—between January of 1958 and October
of the same year.⁶

For most large programs, and certainly for budget and consolidated cash 
totals, the user who treats annual budget estimates as forecasts does so at 
considerable risk. This is not a new and startling development, but it increases 
in importance as the budget gets larger. Bassie addressed himself to the basic 
question in his new book on Economic Forecasting, asking,

“‘Does the budget as it is sent to Congress represent a satisfactory basis for forecasting?’
The answer to this question must be an unequivocal ‘No!’ It is useful in arriving
at a forecast but cannot in itself be considered to present the kind of estimates needed.

“There are several qualifications that have to be immediately noted. Legislation
will modify existing programs to some extent, and the budget proposals for new
legislation may or may not get through. Business developments will modify results
through their effects on flexible spending and tax programs and may even induce the
undertaking of new programs. Furthermore, there are likely to be biases of one kind or
another in the estimates, reflecting the policies of the administration or the attitude
of the agencies involved, since both have to follow arbitrary assumptions to some
extent in order to arrive at specific figures.”⁷


Bassie went on to point out that the deficit for fiscal year 1952 was estimated
in January 1951 at $16.5 billion, but turned out to be $4 billion. The
discrepancy in absolute terms, although the direction is different, is about the 
same as that between the January 1958 estimate of a $0.5 billion surplus and 
the September 1958 estimate of a $12 billion deficit for fiscal 1959. The actual 
deficit for fiscal 1959 turned out to be $12.5 billion. 
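
The comparison of the two swings can be checked directly. The short sketch below is an illustration only (the function name `revision` is hypothetical); it uses the figures quoted in the paragraph above, with surpluses entered as positive and deficits as negative amounts in billions of dollars.

```python
def revision(first_figure, later_figure):
    """Signed change from an earlier surplus(+)/deficit(-) figure to a later one."""
    return later_figure - first_figure


# Fiscal 1952: January 1951 estimate of a $16.5 billion deficit vs. an actual deficit of $4 billion.
fy1952 = revision(first_figure=-16.5, later_figure=-4.0)

# Fiscal 1959: January 1958 estimate of a $0.5 billion surplus vs. the
# September 1958 estimate of a $12 billion deficit.
fy1959 = revision(first_figure=0.5, later_figure=-12.0)

print(f"fiscal 1952: {fy1952:+.1f}   fiscal 1959: {fy1959:+.1f}")
```

Both swings amount to $12.5 billion; only the sign differs, the first being an improvement on the estimate and the second a deterioration.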

For each of the postwar fiscal years, the actual totals are compared with the 
various official estimates in Tables 724 and 725, covering, respectively, the 
conventional budget totals and the consolidated cash totals. Since new Gov- 
ernment activities often are not foreseen, and since some Presidential pro- 
posals are not adopted, the programs covered in the various expenditure col- 
umns for the same year are not necessarily all the same. Also, the variation 
from column to column for receipts in a given year may reflect differences in 
tax rates as well as differences arising from economic conditions and from esti- 
mating variation or error. 

Despite these qualifications, there is much food for thought in
these data. Many interesting questions can be asked but few conclusive an- 
swers given. For example, does the original estimate of a large budget deficit 
seem to restrain expenditures, or does the estimate of a closely balanced budget 
seem to impose a greater restraint? Is there anything (politically or adminis- 
tratively) to be gained by improving on the estimate of a large deficit, or to be 
lost by falling short of an estimated balance? When the executive and legis- 
lative branches are controlled by the same party, is there a closer adherence 


6 For a fuller discussion of the relationship of economic variables to Federal expenditures, see Lusher, David W., 
“The Stabilizing Effectiveness of Budget Flexibility” in Policies to Combat Depression, A Conference of the Univer- 
sities—National Bureau Committee for Economic Research, National Bureau of Economic Research, Princeton: 
Princeton University Press, 1956, pp. 77-89, and the comments by Cohn, S. M., same volume, pp. 90-100.

7 Bassie, V. Lewis, Economic Forecasting, New York: McGraw-Hill Book Company, Inc., 1958, p. 196.
TABLE 724. COMPARISON OF ACTUAL BUDGET TOTALS WITH ESTIMATES,
1947-1959

(In billions of dollars)

       |           Budget receipts           |         Budget expenditures         | Budget surplus (+) or deficit (-)
Fiscal |          Estimates                  |          Estimates                  |          Estimates
year   | Original Midyear Subsequent Actuals | Original Midyear Subsequent Actuals | Original Midyear Subsequent Actuals
       |   budget  review     budget         |   budget  review     budget         |   budget  review     budget
1947   |                                     |                                     |
1948   |                                     |                                     |
1949   |                                     |                                     |
1950   |                                     |                                     |
1951   |                                     |                                     |
1952   |                                     |                                     |
1953   |                                     |                                     |
1954   |                                     |                                     |
1955   |                                     |                                     |
1956   |                                     |                                     |
1957c  |                                     |                                     |
1958c  |                                     |                                     |
1959   |                                     |                                     |

a No Midyear Review was issued for fiscal year 1951.
b No estimates of receipts or the deficit were made in brief release issued.
c Excludes Federal-aid highway expenditures and related revenues which, starting in the fiscal year 1957,
are accounted for in a separate trust fund.
d “Actuals” are preliminary and subject to some relatively small revision.
NOTES: (1) Some adjustments have been made to published figures so that all amounts shown are generally
comparable as to coverage and concept and are consistent with present treatment. Thus, even
though this was not the practice in the earlier years, budget receipts and expenditures exclude
refunds of receipts, payroll taxes transferred to the railroad retirement trust fund, etc. None of
the adjustments affect the surplus or deficit.
(2) Detail may not add to totals due to rounding.


to original budget estimates? Speculation about the answers to these questions 
will have to be left to experts in the fields of political science and public ad- 
ministration. 

Certainly, Tables 724 and 725 make it clear that the budget estimates by 
themselves cannot satisfy the purposes of the forecaster. Nevertheless, it should 
be emphasized that for several months, at least, the administration and its 
spokesmen must defend them as forecasts before the Congress. The facts of 
political life and the discipline of public administration make it essential for 
officials of the executive branch to exercise their persuasive powers in attempt- 
ing to convince the Congress to adopt the program and plan recommended by 
the President. Further, it is not feasible administratively to revise estimates of 
all the budgetary details and of the total with great frequency, particularly 
when congressional subcommittees and full committees are modifying pro- 
posals daily. Since today’s revision could be obsolete tomorrow, the tendency 
is to stick with the budget estimates until they become quite unreasonable. 

There have been occasions, and the calendar year 1958 was the most recent, 
when it became necessary or advisable to indicate the need for revision and 
the magnitude of the revision that seemed reasonable. This is usually done by 
the Secretary of the Treasury and the Director of the Bureau of the Budget 


TABLE 725. COMPARISON OF ACTUAL CONSOLIDATED
CASH TOTALS WITH ESTIMATES, 1947-1959

(In billions of dollars)

       |      Receipts from the public       |       Payments to the public        | Excess of receipts (+) or payments (-)
Fiscal |          Estimates                  |          Estimates                  |          Estimates
year   | Original Midyear Subsequent Actuals | Original Midyear Subsequent Actuals | Original Midyear Subsequent Actuals
       |   budget  review     budget         |   budget  review     budget         |   budget  review     budget
1947   |     33.2    40.9       41.4    43.5 |     35.6    38.1       37.6    36.9 |     -2.4    +2.8       +3.9    +6.6
1948   |     38.7    42.7       46.3    45.4 |     35.8    37.2       38.5    36.5 |     +3.0    +5.5       +7.7    +8.9
1949   |     46.4    41.4       42.9    41.6 |     39.4    40.0       40.1    40.6 |     +7.1    +1.4       +2.8    +1.0
1950   |     47.2       a       41.7    40.9 |     45.7       a       46.5         |     +1.5       a       -4.9    -2.2
1951   |     43.1       a       49.3    53.4 |     45.8       a       49.1    45.8 |     -2.7       a       +0.2    +7.6
1952   |     61.3       a       68.6    68.0 |     74.1       a       72.6    68.0 |    -12.8       a       -4.9      +*
1953   |     76.8    74.4       74.9    71.5 |     87.2    81.2       76.8    76.8 |    -10.4    -6.8       -1.9    -5.3
1954   |     75.2    75.1       74.9    71.6 |     81.8    75.5       75.2    71.9 |     -6.6    -0.5       -0.2    -0.2
1955   |     70.8    67.5       66.6    67.8 |     70.7    69.6       69.0    70.6 |             -2.1       -2.4    -2.7
1956   |     68.8    70.9       73.5    77.1 |     68.2    70.6       71.0    72.6 |     +0.6                       +4.5
1957   |     75.4    80.8       81.7    82.1 |     72.9    77.2       78.3    80.0 |     +2.4    +3.7       +3.5    +2.1
1958   |     85.9    85.9       85.1    81.9 |     83.0    82.8       84.9    83.4 |     +3.0    +3.1       +0.2    -1.5
1959   |     87.3    80.4       81.7    81.5 |     86.7    94.1       94.9    94.6 |     +0.6   -13.7      -13.2   -13.0

a No revised estimates were made.
b “Actuals” are preliminary and subject to some relatively small revision.
* Less than $50 million.
NOTES: (1) To assure comparability over the period shown, the figures for both receipts and payments for
fiscal years 1947-1949 were adjusted to exclude refunds of receipts.
(2) Detail may not add to totals due to rounding.


before congressional committees or sometimes in public addresses.⁸ The forecaster
would have to rely on his news sources to be alerted to such official revisions.
The month of April—three months after presentation of the budget—
seems to be the earliest that such a revision has been made. 

However, after the start of each fiscal year and after the adjournment of the 
Congress, the Bureau of the Budget regularly prepares and publishes a Mid- 
year Review, presenting revised estimates for the year and describing the dif- 
ferences between the new estimates and the ones made in January in the an- 
nual budget. The revised figures are a much better reflection of the expected 
budgetary outlook than is the budget. They have the advantage of being much 
closer to the end of the year being estimated; they require much less lead-time 
in their preparation and thus can take greater advantage of current actual 
data; and they can reflect the legislative and appropriation actions taken by 
the Congress, although some modification of these is possible through supple- 
mental bills in the next session which might affect the last quarter or half of 
the fiscal year. 

Enough of annual estimates; the forecaster is usually interested in months 
or quarters, in: 


8 For example, see address of Maurice H. Stans, Director of the Bureau of the Budget, at the University of 
North Carolina, April 11, 1958, “Financial Responsibility of Government” (multilith, Bureau of the Budget press 
release, April 11, 1958), and “Remarks by Secretary of the Treasury Robert B. Anderson before the American Society
of Newspaper Editors, April 18, 1958,” (mimeo., Treasury Department press release, April 18, 1958).
4. FORECASTS FOR PARTS OF THE YEAR


Such estimates for future periods are not made officially by the Govern- 
ment. Individual staff members in several Federal agencies, however, find it 
necessary to make such forecasts to help them do their jobs effectively. Thus, 
techniques have been developed and estimates derived. Even though they 
have no official standing as such, these estimates certainly have to be impor- 
tant factors in such matters as Treasury financing and refunding operations. 

The sources of part-yearly data on Federal financial transactions are the 
Daily Statement of the United States Treasury, and the Monthly Statement of 
Receipts and Expenditures of the United States Government. The latter is the 
report on the budget results, on the trust funds, on investments of major Gov- 
ernment enterprises, and on the public debt, presenting figures on receipts 
and expenditures in great detail. The former is a consolidated statement of
Treasury transactions only, with much less detail, on a checks paid rather 
than a checks issued basis. 

From what has been said up to now, it should be readily recognized that one 
way not to forecast expenditures for coming months or quarters is to subtract 
blindly the actuals to date from the estimates for the whole year which are 
published in the annual budget. Even the annual estimates in the Midyear 
Review cannot usually serve this purpose without some modification because 
relatively small differences in annual estimates become larger, proportionately, 
when applied to a period of a few months. 
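
The arithmetic behind this warning can be sketched briefly. The figures below are hypothetical, chosen only for illustration and not drawn from any budget document: an annual estimate that is 2 percent too high, with nine months of actual spending already recorded, implies a final quarter that is 8 percent too high.

```python
def remainder_error_pct(annual_estimate, true_annual, actual_to_date):
    """Percentage error in the remaining-period figure obtained by
    subtracting actuals to date from an imperfect annual estimate."""
    implied_remainder = annual_estimate - actual_to_date
    true_remainder = true_annual - actual_to_date
    return 100.0 * (implied_remainder - true_remainder) / true_remainder


# Hypothetical program, in millions of dollars (illustrative values only).
annual_estimate = 1020.0   # published estimate for the full year
true_annual = 1000.0       # what the year eventually turns out to be
actual_to_date = 750.0     # actual spending through the first nine months

annual_error = 100.0 * (annual_estimate - true_annual) / true_annual
quarter_error = remainder_error_pct(annual_estimate, true_annual, actual_to_date)

print(f"error in annual estimate:   {annual_error:.1f} percent")
print(f"error in implied remainder: {quarter_error:.1f} percent")
```

The same absolute discrepancy of 20 is measured against 1,000 in the first case and against only 250 in the second, which is why the proportionate error grows as the year runs out.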

A much better technique is to segregate the larger Government programs and 
those with a volatile expenditure history from the small and steady ones, and to 
look at each large or sensitive program separately. This can be done in various 
degrees of detail. As an example, the following have been found worthy of 
an individual look in an overall exercise of this kind: Department of Defense, 
Mutual Security Program, interest, the Commodity Credit Corporation, the 
Post Office, and the Housing and Home Finance Agency. 

Then, for each separate piece or agency the following procedure is useful: 


1. Consult an expert, if possible, and any recent information, such as news 
articles, press releases, testimony before congressional committees, speeches 
of informed officials, or the independent estimates occasionally made by 
congressional committees (mainly Joint Committee on Internal Revenue 
Taxation and Joint Economic Committee), and make a tentative “ad- 
justed annual estimate” where it seems advisable. In some cases the 
latest available information might be in terms of absolute or percentage 
changes that are expected within the year, and in others it might be an- 
nual totals or changes over a period of years. 

2. Try to discern and separate seasonal and erratic movements from trend 
by arraying the actual monthly or quarterly figures for each program in 
past years. 

3. Establish monthly or quarterly “target” figures for the year which reconcile
as well as possible with recent actual data and with the adjusted annual
estimate, keeping in mind the available information on trends and totals,
and the seasonal pattern that seems evident. This sometimes requires a
series of successive approximations and frequently results in discarding
as untrustworthy some presumably relevant information that has been
collected. Unfortunately, “dope stories” in newspapers and news magazines
are frequently inconsistent.

4. Consult the expert again, and discuss differences between his earlier ideas
and the current state of the forecast, since they may well not be in
agreement. It helps to touch base in any case, because an expert can save
the uninitiated from obvious blunders!
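
The mechanical part of steps 2 and 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used in Government staff work: the seasonal pattern is taken to be the average share of each quarter in past annual totals, trend and erratic movements are ignored, and all figures and function names are hypothetical. The judgment called for in steps 1 and 4 has no counterpart in the code.

```python
def seasonal_shares(past_years):
    """Average share of each quarter in its own year's total (step 2)."""
    n_quarters = len(past_years[0])
    shares = [0.0] * n_quarters
    for year in past_years:
        total = sum(year)
        for q, value in enumerate(year):
            shares[q] += value / total
    return [s / len(past_years) for s in shares]


def quarterly_targets(adjusted_annual, shares, actuals_to_date):
    """Keep the actual quarters and spread the rest of the adjusted annual
    estimate over the remaining quarters in proportion to their seasonal
    shares (step 3)."""
    done = len(actuals_to_date)
    remaining_total = adjusted_annual - sum(actuals_to_date)
    remaining_share = sum(shares[done:])
    targets = list(actuals_to_date)
    for q in range(done, len(shares)):
        targets.append(remaining_total * shares[q] / remaining_share)
    return targets


# Hypothetical quarterly expenditures for one program in two past years.
past_years = [[20.0, 30.0, 25.0, 25.0],
              [22.0, 28.0, 24.0, 26.0]]
shares = seasonal_shares(past_years)

# Adjusted annual estimate of 110, with one quarter of actual data in hand.
targets = quarterly_targets(110.0, shares, [24.0])
print([round(t, 1) for t in targets])
```

By construction the targets add to the adjusted annual estimate, so a revision of that estimate after the second consultation with the expert only requires running the allocation again.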


A very significant portion of this exercise is bound to be concerned with the 
military expenditures of the Department of Defense, which comprise in total 
about one-half of all budget expenditures. Here, if time and patience allow, 
refinement within the departmental total is possible and desirable. A monthly 
Status of Funds report is prepared by the Department of Defense, which gives 
monthly data on obligations (including contracts) as well as expenditures, and 
gives both in a detailed breakdown by cost category.9 It is possible to capitalize
then on the fact that some of the categories such as pay and subsistence
are fairly stable, given the number of men and women in the armed services, 
and the volatility of expenditures can thus be confined to a few major areas, the 
chief of which is major weapons procurement. In this area, a comparison of 
the general level of obligation with expenditures should indicate at least 
whether spending is headed up or down. Further, some knowledge of expenditure
lags can be brought to bear. For example, Federal expenditures under an
aircraft contract seem to be made roughly at the rate of 15 percent within
the first year, 30 percent within the second, and 55 percent spread over the 
third, fourth, and fifth years. 
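
A lag pattern of this kind can be applied to a series of obligations to obtain a rough expenditure path. The sketch below is illustrative only. It assumes that the 55 percent is spread equally over the third, fourth, and fifth years, which the rates quoted above do not specify, and the obligation figures are hypothetical.

```python
# Share of a contract paid out in each year after the obligation is made.
# The equal split of the last 55 percent is an assumption for illustration.
LAG_PATTERN = [0.15, 0.30, 0.55 / 3, 0.55 / 3, 0.55 / 3]


def expenditures_from_obligations(obligations, pattern=LAG_PATTERN):
    """Spread each year's obligations over later years by the lag pattern."""
    horizon = len(obligations) + len(pattern) - 1
    spending = [0.0] * horizon
    for year, amount in enumerate(obligations):
        for lag, share in enumerate(pattern):
            spending[year + lag] += amount * share
    return spending


# Hypothetical obligations of 100 in the first year and 200 in the second.
spending = expenditures_from_obligations([100.0, 200.0])
print([round(s, 1) for s in spending])
```

In the second year, spending reflects both the 30 percent stage of the first year's contracts and the 15 percent stage of the second year's, which is how a rise in obligations shows up in expenditures only gradually.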

However, it must be emphasized that a good deal of uncertainty surrounds 
this business of forecasting defense expenditures, and a good deal of work still 
needs to be done. Statistical analysis has not yet been used to full advantage 
on the data presently published by the Department of Defense, and there is 
room for improving the basic statistics on obligations which have not always 
been on a consistent conceptual basis. The time lag in publishing the data, 
usually anywhere from two to four months, is also a handicap. The economic 
forecaster might well wish, however, that as much data were available monthly 
on obligations or contracts in other Government programs as is available for 
defense. 

For those who are interested in figures, Table 728 presents some data by 
quarters, showing separately the large agencies and the agencies with rather 
volatile expenditures. In internal staff work in the Federal Government, a 
more detailed breakdown is used, but this should suffice as an adequate starter 
for those who want to begin collecting relevant information and forecasting 
quarterly expenditures. It should be noted that the quarterly figures shown 
have not been adjusted for possible seasonal variation. Development of fairly 
reliable seasonal indexes is another area where the full force of existing statis- 
tical tools has not yet been applied to the analysis of Government expenditures. 


9 E.g., Department of Defense, Monthly Report on Status of Funds by Budget Category, EF AD 340, 30 June 1958
prepared by the Fiscal Analysis Branch, Office of the Secretary of Defense (Comptroller). 
5. CONCLUSION


Forecasting Federal expenditures is still a wide open field in which statisti- 
cians should be able to develop techniques to help improve on the many horse- 
back judgments which are now made by program experts and political seers. 

Once embarked on such a mission, however, the government statistician 
finds himself diverted more and more frequently to help solve operating prob- 
lems. Although the initial purpose of the exploration may have been to improve 
forecasts, statistical techniques help pinpoint operating problems and policy 
issues, and then other statistical techniques can help in their solution. As a 
result, instead of forecasting, the statistician might find himself helping more 
closely in the conduct of the affairs of state, which budget figures, after all, 
do describe. 
ANALYSIS OF VITAL STATISTICS BY CENSUS TRACT*


ELIZABETH J. COULTER
Ohio State Department of Health 
AND 
LILLIAN GURALNICK 
Department of Health, Education, and Welfare 

This report presents the results of a questionnaire survey of official 
health agencies on the current uses of census tract data with special 
reference to studies of health as related to socioeconomic status. Inter- 
esting studies are now being conducted, but for health program pur- 
poses a need is seen for a further extension of use of census tract data. 
Comparison of mortality, for example, of the population in areas of 
equal socioeconomic status in the various cities across the country offers 
new possibilities in the evaluation of the effects of such as air pollution 
and climate on mortality. 


1. INTRODUCTION 


THIS report is a summary of the responses to a questionnaire on the uses of
census tracts for the analysis of health and vital statistics. Early in 1958, 
the Public Health Conference on Records and Statistics sent the questionnaire 
[19] to all cities that had a population of 50,000 or more in 1950 and had been 
subdivided into census tracts. The chief purpose of the questionnaire was to 
provide information on the activities of large cities in the analysis of vital sta- 
tistics according to socioeconomic characteristics of the population. 

The lack of such analyses has been one of the serious gaps in vital statistics 
in the United States. Studies undertaken in the past have generally been based 
on the report of occupation of the decedent shown on the death certificate, or 
of the father as given on the birth certificate. On the whole these studies have 
been neither very extensive nor very successful. There has been another ap- 
proach to describing the mortality differentials between socioeconomic groups. 
In a number of cities mortality rates have been computed for the population 
living in each census tract, and the rates then ranked or correlated with the 
average characteristics of the population in the tract as they are measured in 


* This report has been prepared by the Subcommittee on Problems in Using Mortality Data by Census Tracts. 
This was a Subcommittee of the Mortality Statistics Working Group of the Public Health Conference on Records 
and Statistics (PHCRS). Members include: 
Dr. Elizabeth J. Coulter (Chairman), Chief Statistician, Statistical Analysis Unit, Division of Vital Statistics, 
Ohio State Department of Health 

Mr. Frank C. Bauer, Chief, Public Health Methods, Chicago Board of Health 

Mr. Robert W. Buechley, Associate Social Research Technician, Bureau of Chronic Diseases, California State 
Department of Public Health 

Dr. F. Herbert Colwell, Director, Office of Statistics and Research, City of Philadelphia Department of Public 
Health 

Dr. Otis Dudley Duncan, Associate Director, Population Research and Training Center, University of Chicago 

Miss Frieda Greenstein, Senior Statistician, Bureau of Records and Statistics, City of New York Department
of Health

Miss Lillian Guralnick, Statistician, Mortality Analysis Section, National Office of Vital Statistics 

The study, initiated in March 1956, had not been completed when the PHCRS was reconstituted as a consultative
and collaborative study program of the Public Health Service. The Subcommittee, therefore, completed its
assignment under the sponsorship of the National Office of Vital Statistics, Public Health Service, Washington 25, 
D.C. 
the Census of Population. The possibilities of this method, which depends on
knowledge of the place of residence of the decedent, seemed to warrant further 
exploration. 

At the 1956 meeting of the Public Health Conference on Records and Sta- 
tistics, the Mortality Working Group was asked to undertake this task. The 
Conference consists of registrars and statisticians employed in State health 
departments who meet as a group every two years under the auspices of the 
National Office of Vital Statistics in the United States Public Health Service. 
It operates through small Working Groups, each of which concentrates on solv- 
ing problems of interest to public health statisticians and registrars. Several 
members of the Mortality Working Group, assisted by consultants who were 
expert in the field of census tract usage, were given the assignment to study 
problems in using mortality data by census tracts as they relate to the question of
socioeconomic status. In its initial discussion, the group indicated particular
interest in learning whether any of the methods in use to rank mortality rates 
by socioeconomic characteristics of census tracts could be applied equally well 
to a number of cities and thus make possible comparisons of economic classes 
between cities as well as within cities. Perhaps a method could be found that 
would permit generalizations to be made for the urban population of a State, 
a geographic region, or even the United States. 


2. THE CENSUS TRACT 


The concept of census tracts was originated by the late Dr. Walter Laidlaw 
in New York City in 1906 [1]. At his request the Bureau of the Census made 
tabulations of data for 1910 by census tracts for New York and for the seven 
next largest cities. Tract data were again tabulated for the same eight cities in 
1920, and in 1930 this number was increased to 18. By 1940, there were 60 cities 
for which tract data were available; by 1950, there were 64 cities, and it is hoped 
that by 1960 census tracts will have been established in all cities of 50,000 or
more population. 

Each tract is a small area, having a population usually between 3,000 and 
6,000, that is fairly homogeneous in its characteristics. The tract areas are 
established also with consideration for uniformity in size and with regard for 
natural geographic features. The physical boundaries of a tract are intended 
to remain unchanged from census to census but, under compelling circum- 
stances, they have been revised or the tract subdivided [28]. 

The Committee on Census Enumeration Areas was established by the Amer- 
ican Statistical Association in 1931. Local committees were organized in many 
cities and local key persons were appointed to maintain liaison with the Bureau 
of the Census and to stimulate and coordinate activities of local consumers of 
tract data. Uses of tract data have been reported annually at the meetings of 
the Committee on Census Enumeration Areas during a session of the American 
Statistical Association meeting. Several publications designed to assist and
inform users of tract data have been produced. The first “Census Tract Manual”
was issued in 1934 by the Bureau of the Census. Subsequent editions were
prepared by Howard Whipple Green and the current edition (1958) has again 
been issued by the Bureau of the Census [27]. A questionnaire survey similar 
to the present one was made in 1945 and resulted in a paper prepared by
Howard Whipple Green, “Report of Activities in Census Tract Cities” [11]. 


3. THE PRESENT SURVEY 


A questionnaire was developed to obtain information on city and metro- 
politan area uses of census tracts for vital and health data in tracted areas in 
the United States. Specific questions were included to obtain information on 
available coding guides and maps by census tract, health data coded and tabulated
by tract in local health departments, use of socioeconomic classes based
on tract characteristics, and intercity comparisons of data for tracts of the same 
characteristics. 

The group of localities to be included in the study was established on the 
basis of a list supplied by the United States Bureau of the Census at the end of 
1957 on the tract status of local areas in the United States. The decision was 
made to send the questionnaire to cities of 50,000 population or more in 1950 
that were shown by the Census Bureau list as tracted in 1950 or definitely ex- 
pected to be tracted in 1960, as well as to counties in standard metropolitan 
areas tracted in 1950. Information which became available after the queries
had been made [2] indicated that there were 35 cities of 50,000 population or
more that were tracted in 1958 but were not queried chiefly because they were 
shown in the list used in determining cities to include in the survey as possibly 
not to be tracted in 1960. 

The questionnaire was sent to the official health agencies in the group of 
localities selected for study. The mailing list of these agencies was established 
by use of the “Directory of Full-Time Local Health Units” [29]. The complex- 
ity of local health organization in the United States, with a number of different 
health departments often serving one metropolitan area, presented difficulties 
in preparing the mailing list. Where there was a separate health department 
for the central city of the metropolitan area and for the remaining part of the 
county, questionnaires were sent to both jurisdictions, as well as to health de- 
partments in other counties of tracted standard metropolitan areas. No effort 
was made, however, to reach places of under 50,000 that were included in a 
tracted county but had a separate health department. Determination of the 
area served by each county or city health department presented a problem. 

The questionnaires were mailed in February, 1958. By use of follow-up let- 
ters in the next few months, replies were obtained from all but eight of the 
health departments queried. Approximately half of the questionnaires were 
completed by the local health officer and slightly less than one-fourth by per- 
sons engaged in registration or statistics. The remaining respondents included 
administrative assistants, nurses, public health analysts, health educators, and 
clerks. 

Replies to the questionnaire indicated that many agencies outside the health 
department were interested in health data by tract. The types of agencies noted 
most frequently were local health or welfare agencies, chambers of commerce, 
city planning groups, other departments of government, and universities. 

Tabulation of the results of the questionnaire presented a number of diffi- 
culties, owing in part to weaknesses in design of the questionnaire. In general,
there were variations in interpretation of the questions by persons of diverse 
backgrounds and sometimes limited experience with census tracts. Several tab- 
ulating problems resulted. First, because of the design of the questionnaire, it 
was often not possible to determine whether an incomplete answer to a question
meant “no” or “unknown”. Second, the replies to various parts of individual
questionnaires were sometimes inconsistent. Third, information for the
standard metropolitan areas presented particular difficulties, because it was 
obtained from replies by city or county health departments within them, which 
sometimes resulted in a number of different answers for the same metropolitan 
area. 

No follow-up letters were sent to check inconsistencies in the questionnaires 
because this involved too much additional work for the scope of the project. 
Three decisions were made relating to tabulations. First, the replies were accepted
as given with no editing. Some changes were suggested by other available
data but since such information was not uniformly available, no general
rules for applying it could be developed. Second, in preparing the tables an 
affirmative was included for each response with a definite “yes”. Third, for 
the standard metropolitan areas an affirmative reply was used for each item 
for which at least one local health department in the area replied in the affirma- 
tive, even though responses from one or more health departments in the area 
were “no”, “blank”, or “unknown”. An exception was made, however, where 
affirmative answers indicating a tract program were given in the questionnaire 
for standard metropolitan areas not shown by the Bureau of the Census as 
tracted in 1958. The small number of replies of this type were excluded in the 
sections of the tables referring to standard metropolitan areas. 
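
The third tabulating rule amounts to a simple “at least one yes” combination of replies within an area. A minimal sketch follows, with hypothetical department names, item names, and replies. It does not encode the exception just described for areas not shown by the Bureau of the Census as tracted.

```python
def item_is_affirmative(replies):
    """Apply the 'at least one yes' rule to the replies for a single item."""
    return any(reply.strip().lower() == "yes" for reply in replies)


def tabulate_area(replies_by_department):
    """Combine per-department replies into one set of answers for an area.

    replies_by_department maps a department name to a dict of
    {item: reply}. An item missing from a department is treated as "blank".
    """
    items = sorted({item
                    for replies in replies_by_department.values()
                    for item in replies})
    return {
        item: item_is_affirmative(
            replies.get(item, "blank")
            for replies in replies_by_department.values()
        )
        for item in items
    }


# Hypothetical replies from two health departments in one metropolitan area.
example_area = {
    "city health department": {"coding guide": "yes", "births coded": "no"},
    "county health department": {"coding guide": "no", "births coded": "unknown"},
}
area_answers = tabulate_area(example_area)

# An area counts as having a tract program if any item is affirmative.
has_tract_program = any(area_answers.values())
print(area_answers, has_tract_program)
```

A rule of this kind can only raise the count of affirmative areas, since a single “yes” outweighs any number of “no”, blank, or “unknown” replies from the other departments in the area.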

Data obtained from the Bureau of the Census showed that approximately 80 
per cent of the cities of 50,000 population or more and 30 per cent of the stand- 
ard metropolitan areas in the United States were tracted in 1958 (Table 734). 
As noted above, 35 of the tracted cities, mostly in the 50,000-99,999 population 
group, were not queried, however, because they were not shown by the original 
data used in preparing the mailing list as definitely to be tracted by 1960. The
data shown in this report refer only to the queried areas. 

Slightly less than two-thirds of the tracted cities queried and about half of 
the tracted standard metropolitan areas had a tract program, as defined by a 
“yes” answer to any one or more questions applicable to the locality (Table 
735). In general, relatively more of the larger cities and standard metropolitan 
areas had census tracts available and also in use (Tables 734 and 735). For 
example, all of the cities of 250,000 population and over were tracted and all 
but two had tract programs. On the other hand, only 70 per cent of the cities 
of 50,000-99,999 population were tracted and less than half had a tract pro- 
gram. The proportion of standard metropolitan areas tracted, as well as the 
proportion with a tract program, showed considerable increase for areas of 
500,000 population or more. 

A total of 79 of the 153 tracted cities and 18 of the 51 standard metropolitan 
areas had coding guides or manuals available by census tract or group of tracts 
(Table 736). Tract area maps by street name were somewhat more frequently 
TABLE 734. CENSUS-TRACT STATUS OF CITIES AND STANDARD
METROPOLITAN AREAS, CLASSIFIED BY POPULATION-SIZE
IN 1950: UNITED STATES, 1958

(Tabulated from data prepared by the U. S. Bureau of the Census)

                                      Cities                          Standard Metropolitan Areas a
Population-Size       Total   Tracted   Tracted and   Percent    Total   Tracted   Tracted and   Percent
in 1950                                 Queried       Tracted                      Queried       Tracted

50,000 and Over

50,000-99,999
100,000-249,999
250,000-499,999
500,000-999,999
1,000,000 and Over

a Metropolitan State economic areas were used for New England.
b Includes twelve standard metropolitan areas only partially tracted, four of population size 250,000-499,999,
two of population size 500,000-999,999, and six of 1,000,000 and over population.


available, but only 26 cities and five standard metropolitan areas had such 
maps by house numbers. 

About 40 per cent of the 153 cities were routinely coding health data by 
census tract or group of tracts (Table 737). The most frequently coded items 
were births and deaths, each coded in 58 cities. Forty-four cities were coding 
morbidity, such as tuberculosis, venereal disease, and other communicable dis- 
eases, and 26 cities coded reports of health department services including nurs- 
ing, sanitation, and school health services. Twenty-one cities coded a variety 
of other types of data, such as marriages, tuberculosis contacts and suspects, 
nonhospitalized tuberculosis cases, a venereal disease blood survey, rabies con- 
trol, patients treated at emergency first aid stations, mental retardation, en- 
vironmental sanitation problems, food control, rodent control, nuisance control, 
industrial hygiene, housing, and special studies. 

There were 32 cities doing routine tabulations by tract. Tabulated data were
of the same general type as coded data with primary emphasis on births and 
deaths, and somewhat more limited tabulation of morbidity and health serv- 
ices. Several of the cities of 500,000 population or more were making detailed 
tabulations within the broad groups noted, such as births by prematurity, 
illegitimacy, and hospital; deaths and diseases due to specific causes and by 
various population characteristics, and specific health services such as child 
health conferences, inoculations, clinic attendance, and sanitation activities by 
type. 

Relatively more of the cities of 250,000 population and over than of the 
smaller cities were coding and tabulating data by tract. All but one of the 18 
cities of 500,000 population and over were coding health data by tract and all 
but four of them were routinely tabulating by tract. The variety of data coded 
did not show a consistent increase with population size, however. 

The questionnaire did not ask for a list of publications prepared in each area 
TABLE 735. TRACT PROGRAM INDICATED IN CITIES AND STANDARD
METROPOLITAN AREAS, CLASSIFIED BY POPULATION-SIZE
IN 1950: UNITED STATES, 1958

(Tabulated from replies to questionnaires)

                                      Cities                    Standard Metropolitan Areas
Population-Size       Tracted    Tract      Percent       Tracted    Tract      Percent
in 1950               and        Program    with          and        Program    with
                      Queried               Program       Queried               Program

50,000 and Over       153a       97b        63.4          51c,d      27         52.9

50,000-99,999         58         25b        43.1          0          0          0.0
100,000-249,999       54         33         61.1          14         6          42.9
250,000-499,999       23         21         91.3          11         4          36.4
500,000-999,999       13         13         100.0         14         9          64.3
1,000,000 and Over    5          5          100.0         12         8          66.7

a Includes eight cities with no reply, four in population group 50,000-99,999 and four in population group
100,000-249,999.
b Excludes one city using wards and voting districts, not census tracts.
c Includes twelve standard metropolitan areas only partially tracted, four of population size 250,000-499,999,
two of population size 500,000-999,999, and six of 1,000,000 and over population.
d Includes two standard metropolitan areas with no reply, one of population size 100,000-249,999, and one of
population size 250,000-499,999.


by census tract. A library of publications based on tract data was established 
some time ago by Howard Whipple Green in the office of the Cleveland Health 
Council. This service has been continued at the Bureau of the Census. An 
“Annotated Bibliography” of reports containing data by tract prepared for the 
census period of 1950 was issued by the Census Bureau in 1954 and a second 
edition listing more recent publications is planned. 


4. LIMITATIONS OF TRACT DATA


It is evident from the replies to the questionnaire that census tract data have 
proved exceedingly useful as a device for studying the characteristics of small 
areas of a city, and changes in these characteristics. Data for these units have 
also served well in city planning, marketing studies, development of health or 
hospital service areas and many other administrative functions in the health 
department. 

As a tool for analytic statistics, some serious limitations of the census tract 
have been described by Foley [9]. Two of the qualifications mentioned are 
the question of homogeneity of characteristics of the population residing in a 
tract, and the meaning of ecological correlations. Both of these problems have 
been discussed in the literature, the first in a paper by Myers [16] and the 
second in a series of papers begun by Robinson [20]. 

Myers tested the homogeneity of census tracts in New Haven by using statis- 
tics for city blocks. He found that 10 out of 28 tracts could be considered homo- 
geneous. The paper by Robinson demonstrates the mathematical differences 
between correlations computed on data for individuals and those computed as 
TABLE 736. CODING GUIDES AND MAPS BY TRACT AVAILABLE IN
TRACTED CITIES AND STANDARD METROPOLITAN AREAS,
CLASSIFIED BY POPULATION-SIZE IN 1950:
UNITED STATES, 1958

(Tabulated from replies to questions A-4, 5, and 6 and B-2, 3, and 4 of the questionnaire)

                                        Cities                                Standard Metropolitan Areas
                      Coding Guides      Tract Area Maps            Coding Guides      Tract Area Maps
Population-Size       by Tract or      Street   House    Generally   by Tract or      Street   House    Generally
in 1950               Group of Tracts  Names    Numbers  Available   Group of Tracts  Names    Numbers  Available

50,000 and Over 26 18b 21

50,000-99,999 1
100,000-249,999
250,000-499,999
500,000-999,999
1,000,000 and Over

a Excludes one city with guides available by ward and voting district, not by census tract.
b Excludes one standard metropolitan area with guides available by postal zone, not by census tract.


measures for a group of persons, such as the population of a census tract. 
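
The distinction can be illustrated with a small numerical sketch. The counts below are hypothetical and are not taken from Robinson's paper or from any census tract. They describe three areas of 100 persons each, with two yes-or-no characteristics recorded for every person. The correlation between area proportions is perfect, while the correlation among the individuals themselves is weak.

```python
import math


def phi_coefficient(both, x_only, y_only, neither):
    """Correlation between two yes-or-no traits across individuals."""
    numerator = both * neither - x_only * y_only
    denominator = math.sqrt((both + x_only) * (y_only + neither)
                            * (both + y_only) * (x_only + neither))
    return numerator / denominator


def pearson(xs, ys):
    """Ordinary product-moment correlation between two lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical areas: persons with (both traits, X only, Y only, neither).
areas = [(4, 16, 16, 64), (25, 25, 25, 25), (64, 16, 16, 4)]

# Individual correlation: pool every person into one fourfold table.
pooled = [sum(area[i] for area in areas) for i in range(4)]
individual_r = phi_coefficient(*pooled)

# Area correlation: correlate the proportion with X and the proportion with Y.
prop_x = [(a[0] + a[1]) / sum(a) for a in areas]
prop_y = [(a[0] + a[2]) / sum(a) for a in areas]
ecological_r = pearson(prop_x, prop_y)

print(round(individual_r, 2), round(ecological_r, 2))
```

Within each hypothetical area the two traits are unrelated, so the whole of the area correlation arises from differences between areas, not from any tendency of the same persons to have both traits.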

While these problems cannot be ignored in evaluating the results of census 
tract studies, it is at least theoretically possible to circumvent them. Goodman 
[10] suggests a simple method of estimating the correlation for individuals from 
the data collected for a series of areas. Duncan and Davis [7] describe a way 
to approximate the individual correlation from area data. These authors and 
others point out, however, that there are times when the correlation among 
variables for areas rather than for individuals is the desired statistic. 

The question of homogeneity of individual tracts becomes less important 
if tracts are used as a basis for ecological analysis of urban areas [8]. The
development of classification of areas based on census tract data is discussed 
in the next section. 


5. SOCIOECONOMIC ANALYSES 


The results of the questionnaire indicated that 43 cities and 14 metropolitan 
areas had grouped tracts by socioeconomic classes. The distribution of these 
localities by population-size is tabulated at the top of the next page. 

Studies of health data for 1950 by socioeconomic groups of tracts were re- 
ported for approximately 20 cities or standard metropolitan areas. About «0 
of the studies were for specific causes of illness or death, ten for tuberculosis, 
five for venereal disease, five for cancer, and the remainder for infant deaths, 
poliomyelitis, mental illness, accidents, and suicides. The relation with socio- 
economic status was also examined for life expectancy, prematurity among 
births, fertility, legitimacy, dental caries and dental care needs of school chil- 
                          Tracts Grouped by Socioeconomic Class For:
Population in 1950        Cities      Standard Metropolitan Areas

50,000 and Over           43          14
50,000-99,999             13          0
100,000-249,999           7           2
250,000-499,999           10          1
500,000-999,999           8           5
1,000,000 and Over        5           6


dren, use of clinics, requests for indigent medical service, medical care of the 
aged, and housing. 

The results of the queries indicated that there had been very few intercity 
comparisons of vital and health data based on socioeconomic characteristics 
by tract. The only comparisons of this type were noted for the cancers, cardio- 
vascular mortality, and cirrhosis of the liver for the cities of Los Angeles, San 
Francisco, and Oakland, California [4]. It has been called to our attention 
that an intercity comparison based on income by census tract has been made 
for cancer morbidity data collected for 10 cities by the National Cancer Insti- 
tute [5]. The dearth of comparative studies may be the result, in part, of lack 
of a suitable single economic scale or index constructed from published census 
tract data that might be applied to all cities. It may also be related to the prac- 
tice of classifying vital and health data by census tract in the city of origin. 
Such codes are not usually transmitted to State or Federal offices, and thus are 
not available centrally. 

The socioeconomic classes described in the questionnaire replies were usually 
determined by ranking tracts according to one or more characteristics for 


TABLE 737. HEALTH ITEMS CODED AND TABULATED BY TRACT
IN TRACTED CITIES, CLASSIFIED BY POPULATION-SIZE
IN 1950: UNITED STATES, 1958

(Tabulated from replies to questions A-1, 3, 7, and 8 of the questionnaire)

                      Total Cities     Cities Routinely Coding Specified Types of Data      Tabulations
                      Coding by
Population-Size       Tract or Group                     Fetal    Mor-     Service
                      of Tracts        Births   Deaths   Deaths   bidity   Reports  Other    Routine  Special

50,000 and Over       65a              58       58       54       44b      26b      21b      32       18
50,000-99,999         12a              10       10       10       8b       4b       6        3        2
100,000-249,999       18               13       13       12       10       5        6        9        1
250,000-499,999       18               18       18       16       10       6        5        6        4
500,000-999,999       12               12       12       11       11       7        3        9        8
1,000,000 and Over    5                5        5        5        5        4        1        5        3

a Excludes seven cities coding by local subdivision, but not by census tract: one using district areas, one using
wards and voting districts, and five in Los Angeles County using unincorporated areas in districts. All of these
excluded cities were coding births, deaths, fetal deaths, and morbidity, and all but one of them were coding service
reports or other types of data and making routine tabulations by local subdivision.

b Includes one “planned.”
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which data were available in publications of the Bureau of the Census. Since 
the population in each tract is known, the ranked tracts are then divided into 
classes—in many cases to give equal population groups, in others to give equal 
class intervals of the measure used (income, or rent), and in still others to give 
an approximately “normal” distribution of the population. The ranking item 
most frequently selected was median income for the population of the tract, 
used for about 20 cities or standard metropolitan areas. Rent was used for 14 
localities, education for 10, and occupation for eight. Several localities used 
race or nativity, and some applied housing indices other than rent, such as 
value, owner occupancy, age, substandard conditions, plumbing and crowding. 
Other characteristics noted included sex, age, and marital status of the popula- 
tion; family size, population density; number of women working; public assist- 
ance; delinquency; illegitimacy; purchasing habits; number of automobiles; 
zoning; and diseases and deaths. 
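The ranking-and-grouping procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple layout, the function name `equal_population_classes`, and the midpoint rule for a tract that straddles a class boundary are all hypothetical and are not taken from any of the surveyed studies.

```python
def equal_population_classes(tracts, n_classes):
    """Rank tracts by a measure and split them into classes of roughly
    equal population.

    tracts    -- list of (tract_id, measure, population) tuples; the measure
                 might be median income or median rent (assumed layout)
    n_classes -- number of socioeconomic classes wanted
    Returns a dict mapping tract_id to a class number, 1 = lowest measure.
    """
    ranked = sorted(tracts, key=lambda t: t[1])
    total_population = sum(t[2] for t in ranked)
    classes = {}
    cumulative = 0
    for tract_id, _measure, population in ranked:
        # Assign the whole tract by the midpoint of its share of the
        # cumulative population (an assumed tie-breaking rule).
        midpoint = cumulative + population / 2
        k = min(int(midpoint * n_classes / total_population), n_classes - 1)
        classes[tract_id] = k + 1
        cumulative += population
    return classes


if __name__ == "__main__":
    sample = [("A", 3000, 100), ("B", 2000, 100),
              ("C", 5000, 100), ("D", 4000, 100)]
    print(equal_population_classes(sample, 2))
```

Because whole tracts are assigned to classes, the resulting groups are only approximately equal in population when tract sizes differ.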

In most of the studies reported, the socioeconomic classes have been deter- 
mined by the analyst’s judgment considering only the particular study in prog- 
ress. For a few localities, notably California, more elaborate attempts have 
been made to develop a general classification of the population of cities based 
on the characteristics of the tract of residence. If a satisfactory ecological classi- 
fication of all urban areas based on census characteristics could be developed, 
it is evident that it would be tremendously useful in expanding the range of the 
present city analyses. It would make possible comparative urban research, 
rather than a series of case studies of individual cities [8, 6, 3, 21, 23]. 

Descriptions of several of these classification systems have been published 
[25, 22, 24, 26, 17, 18, 14]. While the methodology applied has differed, these 
indices have many common elements. In most of them an index or scale is 
built up from a number of characteristics of the tract given in the census pub- 
lications. Various statistical devices have been applied to rank or scale the 
items chosen, or weighted combinations of the items. The tracts are then as- 
signed to “type areas” according to their ranking on the combined scale, or on 
several scales used as a coordinate system. 

There has also been some discussion in the literature of the value of the areas 
thus constructed [30, 13]. As far as we know, there has been no attempt to 
use any one of the suggested systems for a great number of cities. From the 
limited point of view of studying variation in mortality (or vital statistics) 
rates for groups of the population, the “social area” concept may represent too 
complex a function. It has been pointed out [13] that a logical frame for the
social area has not yet been constructed. Interpretation of mortality figures 
for these areas thus becomes quite difficult, unless the analyst resorts to discuss- 
ing the variation of mortality in relation to the individual components of the 
indices used to create the areas. Under these conditions, a simpler scale describ- 
ing only a single dimension, such as economic status, might be more useful. 
Such a scale, designed so that areas of equal status in different cities might be 
compared, would permit assessment of mortality rates between and within 
cities and serve to extend considerably an understanding of conditions produc- 
ing differential mortality. 
6. CONCLUSION


The relation between socioeconomic status and mortality has been clearly 
demonstrated in a number of excellent studies. Comparisons have also been 
made for an individual city for several census periods. The series of studies for 
Chicago offer one example [12, 15]. In a great number of these studies median 
income by tract (1950) or median rent by tract (for 1930 and 1940) was the 
major element in establishing a scale for socioeconomic status. But for health 
program applications, we need to be able to go one step further. If we could 
compare mortality for areas of equal economic status in cities across the coun- 
try, it would help in evaluation of other factors in mortality such as air pollu- 
tion, climate, or the impact of city services. For this purpose the present survey 
indicated that there was not yet an operating solution that could result in such
mortality analyses for a large number of urban areas. The wealth of studies on 
the nature of population distribution in metropolitan areas according to char- 
acteristics of census tracts holds forth the promise that such methodology, or 
more intensive studies, may be forthcoming when urban data for 1960 are avail- 
able. 


REFERENCES 


[1] American Statistical Association, Golden Anniversary of Census Tracts. Papers
presented at the Census Tract Conference, September, 1956.

[2] Batschelet, Clarence E., “Tracting in the United States.” Paper presented in
Toronto, Canada, May 21, 1958, at the Annual Conference of the National Institute of
Municipal Clerks, U. S. Bureau of the Census.

[3] Buechley, Robert W., “Review of Social Area Analysis by Shevky and Bell,”
Journal of the American Statistical Association, 51 (1956), 195-7.

[4] Buechley, Robert W., “Social variables for public health use from census tract
statistics,” unpublished paper presented at a meeting of the statistics section of the
Northern California Public Health Association, May 14, 1958.

[5] Dorn, H., and Cutler, S., “Morbidity from cancer in the United States,” Parts I and
II, Public Health Monograph, 56 (1958). Public Health Service, U. S. Department
of Health, Education and Welfare.

[6] Duncan, Otis Dudley, “Review of Social Area Analysis by Shevky and Bell,”
American Journal of Sociology, 61 (1955), 84-5.

[7] Duncan, Otis Dudley and Davis, Beverly, “An alternative to ecological correlation,”
American Sociological Review, 18 (1953), 665-6.

[8] Duncan, Otis Dudley and Duncan, Beverly, “Residential distribution and
occupational stratification,” American Journal of Sociology, 60 (1955), 493-503.

[9] Foley, Donald L., “Census tracts and urban research,” Journal of the American
Statistical Association, 48 (1953), 733-42.

[10] Goodman, Leo A., “Ecological regression and behavior of individuals,” American
Sociological Review, 18 (1953), 663-4.

[11] Green, Howard Whipple, Report of Activities in Census Tract Cities, January 15, 1946.

[12] Hauser, Philip M., Differential Fertility, Mortality and Net Reproduction in Chicago:
1930. Unpublished dissertation for degree of doctor of philosophy, University of
Chicago, 1938.

[13] Hawley, Amos H., and Duncan, Otis Dudley, “Social area analysis: A critical
appraisal,” Land Economics, 33 (1957), 338-44.

[14] MacCannell, Earle H., Van Arsdol, Maurice D., Jr., and Schmid, Calvin F., An
Empirical Evaluation of the Shevky and Tryon Urban Typologies. Unpublished
manuscript on file in the Department of Sociology, University of Washington, July 15,
1957.

740 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1959

[15] Mayer, Albert, Differentials in Length of Life. Unpublished dissertation for degree of
doctor of philosophy, University of Chicago, 1950.

[16] Myers, Jerome K., “Note on the homogeneity of census tracts,” Social Forces, 32
(1954), 364-6.

[17] New York State Department of Mental Hygiene, Technical Report of the Mental
Health Research Unit, Syracuse: Syracuse University Press, 1956.

[18] Peters, William S., “A method of deriving geographic patterns of associated
demographic characteristics within urban areas,” Social Forces, 35 (1956), 62-8.

[19] Public Health Conference on Records and Statistics, Joint Report of the Working
Group on Mortality and the Working Group on Natality and Fetal Death, No. 467,
May 21, 1958. National Office of Vital Statistics.

[20] Robinson, W. S., “Ecological correlations and the behavior of individuals,” American
Sociological Review, 15 (1950), 351-7.

[21] Schmid, Calvin F., “Generalizations concerning the ecology of the American city,”
American Sociological Review, 15 (1950), 264-81.

[22] Schmid, Calvin F., Bowerman, Charles E., and Shanley, Fred J., Application of Scale
Analysis Techniques in Defining Ecological Areas. Unpublished paper on file in the
Department of Sociology, University of Washington, April, 1952.

[23] Schmid, Calvin F., MacCannell, Earl H., and Van Arsdol, Maurice D., Jr., “The
ecology of the American city: Further comparison and validation of generalizations,”
American Sociological Review, 23 (1958), 392-401.

[24] Shevky, Eshref, and Bell, Wendell, Social Area Analysis, Stanford Sociological Series
No. 1, Stanford, California: Stanford University Press, 1955.

[25] Shevky, Eshref, and Williams, Marilyn, The Social Areas of Los Angeles, Analysis
and Typology, Berkeley and Los Angeles, California: University of California Press,
1949.

[26] Tryon, Robert C., Identification of Social Areas by Cluster Analysis, University of
California Publications in Psychology 8: 1, Berkeley and Los Angeles, California:
University of California Press, 1955.

[27] United States Bureau of the Census, Census Tract Manual, Fourth Edition,
Washington, D. C.: Government Printing Office, 1958.

[28] United States Bureau of the Census, U. S. Census of Population: 1950, Vol. 3, Census
Tract Statistics, Washington, D. C.: Government Printing Office, 1952.

[29] United States Department of Health, Education, and Welfare, Directory of
Full-Time Local Health Units, Public Health Service Publication No. 118, Washington,
D. C.: Government Printing Office, 1957.

[30] Van Arsdol, Maurice D., Jr., Camilleri, Santo F., and Schmid, Calvin F., “The
generality of urban social area indexes,” American Sociological Review, 23 (1958), 277-84.




A CHECK ON GROSS ERRORS IN CERTAIN 
VARIANCE COMPUTATIONS 


Hyman B. Kaitz
United States Bureau of Labor Statistics 


In stratified samples, the variance of ratio estimates often includes 
the variance of the variable used for stratification. The upper bounds 
of this variance and the corresponding rel-variance are derived for use 
in checking gross errors, and “typical” values of the variance are also 
presented. 


In sample surveys of companies or establishments, a common statistical
procedure is the following:

1. The universe of companies or establishments is stratified on the basis of 
several characteristics, including industry and size, as measured by employ- 
ment at a certain period. 

2. Within individual industry-size cells, ratio estimates are obtained for cer- 
tain characteristics by the formula 


\[ Q'' = \frac{Q'}{B'}\, B''' \]

where $Q''$ is the ratio estimate of the characteristic being measured, $Q'$ is the
sample total of the characteristic, $B'$ is the sample total of employment in the
benchmark period, and $B'''$ is the benchmark employment total for the cell
universe.
The variance of $Q''$ is approximately

\[ s_{Q''}^2 = \frac{1-f}{n}\,(Q''')^2\,\bigl(V_Q^2 + V_B^2 - 2\,r\,V_Q V_B\bigr) \]

where $f$ is the sampling ratio, $n$ is the sample number, $V_Q^2$ and $V_B^2$ are the
rel-variances,¹ $r$ is the correlation coefficient between individual values of $Q$ and $B$
in the universe, and $Q'''$ is the universe total (estimated by $Q''$).

The present note concerns the variance and rel-variance of B. Within an 
estimating cell, this variable falls within specified class limits. For example, if 
the estimating cell is the 100-499 size class for a given industry, all of the 
sample units will have values of B not less than 100 nor more than 499. 

In consequence, the variance of $B$ can never exceed the value $(b-a)^2/4$,
where $b$ and $a$ represent the upper and lower class limits respectively, and
should generally not be far from the value $(b-a)^2/12$. Similarly, the
rel-variance, $V_B^2$, cannot exceed the value $(b-a)^2/4ab$.² These two upper limits are
partially independent of each other, and can consequently be used together in
a checking process for gross errors. The “typical” variance, $(b-a)^2/12$, is only
a rough guide, but should be of some help in an examination of the results.
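As a rough illustration of how the two upper limits might be applied together, the following Python sketch flags a computed cell variance that cannot be right. The function names and the returned messages are hypothetical; only the three formulas come from the text.

```python
def variance_bounds(a, b):
    """Upper bounds and the 'typical' value for a variable confined to
    the class interval [a, b], with 0 < a < b."""
    if not 0 < a < b:
        raise ValueError("class limits must satisfy 0 < a < b")
    width_sq = (b - a) ** 2
    return {
        "max_variance": width_sq / 4,                # (b-a)^2 / 4
        "typical_variance": width_sq / 12,           # (b-a)^2 / 12
        "max_rel_variance": width_sq / (4 * a * b),  # (b-a)^2 / 4ab
    }


def gross_error_check(variance, mean, a, b):
    """Return a list of reasons why a computed variance must be wrong.
    An empty list means the figures pass both checks (which does not
    prove that they are correct)."""
    bounds = variance_bounds(a, b)
    problems = []
    if variance > bounds["max_variance"]:
        problems.append("variance exceeds (b-a)^2/4")
    if variance / mean ** 2 > bounds["max_rel_variance"]:
        problems.append("rel-variance exceeds (b-a)^2/4ab")
    return problems


if __name__ == "__main__":
    # The 100-499 size class used as the example in the text.
    print(variance_bounds(100, 499))
    print(gross_error_check(variance=50000.0, mean=250.0, a=100, b=499))
```

For the 100-499 size class the bounds work out to 39,800.25 for the variance and about 0.80 for the rel-variance, with a typical variance of 13,266.75.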


1 The rel-variance is the square of the coefficient of variation. 
2 For a and b with the same sign. In almost all practical cases, however, only positive values of a and b are
considered.
DERIVATION OF THE UPPER BOUNDS


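In general, the variance of a set of values may be written as

\[ s^2 = \sum_i p_i (x_i - \bar{x})^2, \qquad \sum_i p_i = 1, \quad p_i > 0, \tag{1} \]

with mean

\[ \bar{x} = \sum_i p_i x_i . \]

Let

\[ x_i = (1 - \lambda_i)\,a + \lambda_i\,b, \qquad 0 \le \lambda_i \le 1. \tag{2} \]

Then

\[ \bar{x} = (1 - \bar{\lambda})\,a + \bar{\lambda}\,b, \tag{3} \]

where

\[ \bar{\lambda} = \sum_i p_i \lambda_i . \]

The variance (1) may be rewritten as

\[ s^2 = (b - a)^2 \sum_i p_i (\lambda_i - \bar{\lambda})^2 . \tag{4} \]

But, since $\lambda_i^2 \le \lambda_i$,

\[ \sum_i p_i (\lambda_i - \bar{\lambda})^2 = \sum_i p_i \lambda_i^2 - \bar{\lambda}^2 \le \sum_i p_i \lambda_i - \bar{\lambda}^2 = \bar{\lambda} - \bar{\lambda}^2 . \]

Consequently

\[ s^2 \le (b - a)^2 (\bar{\lambda} - \bar{\lambda}^2). \tag{5} \]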


The right hand expression in (5) is equal to the variance of the two class
limits $a$ and $b$ with weights $1-\bar{\lambda}$ and $\bar{\lambda}$, respectively. Hence any variance of a set of
values, with mean $\bar{x}$, in the interval has, as an upper bound, the variance of the
class limits with weights such that the mean is also $\bar{x}$.

The maximum value of all of these upper bounds is readily found to be
$(b-a)^2/4$ for $\bar{\lambda} = \tfrac{1}{2}$.
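The bound can be checked numerically. The sketch below assumes the reading of the derivation in which the bound for a given mean is $(b-a)^2(\bar{\lambda}-\bar{\lambda}^2)$ with $\bar{\lambda} = (\bar{x}-a)/(b-a)$; the helper names are hypothetical.

```python
import random


def weighted_mean_and_variance(values, weights):
    """Mean and variance of values x_i with weights p_i that sum to 1."""
    mean = sum(p * x for x, p in zip(values, weights))
    variance = sum(p * (x - mean) ** 2 for x, p in zip(values, weights))
    return mean, variance


def two_point_bound(mean, a, b):
    """Variance of the two class limits a and b weighted so that their
    mean equals `mean`: (b-a)^2 (lam - lam^2), lam = (mean-a)/(b-a)."""
    lam = (mean - a) / (b - a)
    return (b - a) ** 2 * (lam - lam ** 2)


if __name__ == "__main__":
    rng = random.Random(1959)
    a, b = 100, 499
    values = [rng.uniform(a, b) for _ in range(20)]
    raw = [rng.random() + 1e-9 for _ in values]
    weights = [w / sum(raw) for w in raw]
    mean, variance = weighted_mean_and_variance(values, weights)
    print(variance <= two_point_bound(mean, a, b))
```

Algebraically the two-point bound equals $(\bar{x}-a)(b-\bar{x})$, which makes plain why it is largest when the mean sits at the midpoint of the class.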

The “typical” variance, $(b-a)^2/12$, referred to earlier, is obtained by
assuming that a rectangular distribution, $f(x) = c$, holds for the class interval $(a, b)$.
The actual distribution is much more likely to exhibit a heavier weighting of
frequencies toward the lower limit. Under these conditions, the actual variance
will tend to be a little lower than $(b-a)^2/12$.³

The upper bound for the rel-variance in the interval is derived in the same
way as that for the variance. The basic rel-variance is

\[ V^2 = \frac{\sum_i p_i (x_i - \bar{x})^2}{(\bar{x})^2} . \tag{6} \]


3 For discrete variables, such as size of firm, the variance of the rectangular distribution is, more precisely,
$((b-a+1)^2-1)/12$.


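With the aid of (2) and (3), this becomes

\[ V^2 = \frac{(b-a)^2 \sum_i p_i (\lambda_i - \bar{\lambda})^2}{\bigl[(1-\bar{\lambda})\,a + \bar{\lambda}\,b\bigr]^2} \le \frac{(b-a)^2 (\bar{\lambda} - \bar{\lambda}^2)}{\bigl[(1-\bar{\lambda})\,a + \bar{\lambda}\,b\bigr]^2} . \tag{7} \]

The right hand side of (7) is the rel-variance for the two values, $a$ and $b$, with
weights $1-\bar{\lambda}$ and $\bar{\lambda}$, and mean $\bar{x}$, and is an upper bound for all rel-variances of
quantities having the same mean.

The maximum value of these upper bounds is readily found to be $(b-a)^2/4ab$
for $\bar{\lambda} = a/(a+b)$.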
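The location of this maximum can be confirmed numerically. The sketch below evaluates the two-point rel-variance bound on a fine grid of weights and compares the best grid point with $a/(a+b)$; the function names and the grid size are assumptions made for illustration.

```python
def rel_variance_bound(lam, a, b):
    """Rel-variance of the two values a and b with weights 1-lam and lam."""
    mean = (1 - lam) * a + lam * b
    return (b - a) ** 2 * (lam - lam ** 2) / mean ** 2


def best_weight_on_grid(a, b, steps=100_000):
    """Weight in [0, 1] (on an evenly spaced grid) giving the largest bound."""
    best_k = max(range(steps + 1),
                 key=lambda k: rel_variance_bound(k / steps, a, b))
    return best_k / steps


if __name__ == "__main__":
    a, b = 100, 499
    print(best_weight_on_grid(a, b), a / (a + b))
    print(rel_variance_bound(a / (a + b), a, b), (b - a) ** 2 / (4 * a * b))
```

Unlike the variance bound, the maximizing weight is not one half: the rel-variance is largest when most of the weight falls on the lower class limit, because that lowers the mean in the denominator.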
AUTOMATIC PROGRAMMING FOR AUTOMATIC COMPUTERS*


O. Locks 
University of California, Los Angeles†

Anyone who attempts to process data or perform computations with 
an automatic digital computer encounters the obstacle of preparing a 
detailed program in computer code and checking this program out on 
the computer. Automatic programming techniques help reduce the
magnitude and cost of this task. Complex statistical and data processing 
problems can be programmed with little or no knowledge of the com- 
puter code. An example is given of an automatically programmed one 
way analysis of variance F computation. 


1. DESCRIPTION OF AUTOMATIC PROGRAMMING


Tedious preparatory work is necessary to treat a statistical or data processing
problem when using an automatic electronic digital computer. Thorough
analysis is necessary to describe the method by which the operations are
to be performed or the equations solved. It is usually advisable to develop a 
flow chart or charts of the process. A program in computer code must be pre- 
pared. This program must then be debugged (tested out) on the equipment 
under simulated operating conditions. Coding and debugging are usually very 
laborious because the equipment cannot perform properly if there are any errors 
in the program. 

This work presents a dilemma to the analyst such as the statistician. If he 
attempts to program the equipment he may spend too much energy and time 
in coding and debugging. This time can often be used for other purposes such
as developing solutions for other problems. If the assistance of programmers is 
used, a risk arises that poor communication between the analyst and his as- 
sistants may make it difficult to obtain a satisfactory solution. 

The purpose of automatic programming is to reduce this barrier between the 
user and the computer. It offers simplified methodology such that, hopefully,
the user is able to develop his own solution without learning the detailed com- 
puter code or employing programming assistance. Frequently-used types of 
computations are coded and inserted into a library of subroutines stored in the 
computer’s memory. These subroutines can then be extracted from the library 
and used by means of simple words or statements known as pseudo-codes. By 
preparing pseudo-code statements in a desired sequence, the analyst determines 
the manner in which those subroutines operate upon his data in the running 
program. 
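The idea of a subroutine library driven by pseudo-code statements can be illustrated with a small modern sketch. It is not a model of any particular system of the period: the statement format (operation, two operand names, destination name) and the contents of `LIBRARY` are assumptions chosen for brevity.

```python
# A hypothetical "library" of previously coded subroutines, keyed by the
# pseudo-code word that calls each one.
LIBRARY = {
    "ADD": lambda x, y: x + y,
    "SUBTRACT": lambda x, y: x - y,
    "MULTIPLY": lambda x, y: x * y,
    "DIVIDE": lambda x, y: x / y,
}


def run(pseudo_code, memory):
    """Execute pseudo-code statements in the order the analyst wrote them.

    Each statement is (operation, left operand, right operand, destination),
    where the operands and destination are names of storage locations.
    """
    for operation, left, right, destination in pseudo_code:
        subroutine = LIBRARY[operation]  # extract the subroutine by its word
        memory[destination] = subroutine(memory[left], memory[right])
    return memory


if __name__ == "__main__":
    program = [
        ("MULTIPLY", "A", "B", "P"),  # P = A * B
        ("ADD", "P", "C", "R"),       # R = P + C
    ]
    print(run(program, {"A": 3, "B": 4, "C": 5}))
```

The analyst writes only the `program` list; the routines in the library are prepared once, in advance, by someone else.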

An example of a widely used subroutine present in automatic programming 
systems¹ is arithmetic multiplication. A complete subroutine to compute and


* The work was performed while the author was associated with the Remington Rand Univac Division of the 
Sperry Rand Corporation at St. Paul, Minnesota. 

† This report is based on a paper delivered on September 10, 1957 at the Atlantic City meetings of the American
Statistical Association. The author gratefully acknowledges the assistance of Grace M. Hopper and Charles Katz of
Remington Rand Univac, who read earlier drafts of this paper and made suggestions for substantial improvement
in content.

1 In the computing industry, the word “system” is often used in different ways. Usually it refers to a
self-contained combination of computing hardware. In the context of automatic programming, it refers to a self-contained
combination of automatic programming devices that operate on a particular (hardware) system. The word system
as used in this paper refers to an automatic programming system rather than a hardware system.
store the product of two numbers may require from three to 50 computer
instructions.² Without automatic programming the user must code this
subroutine if he wishes to perform multiplication. If an automatic programming
system is used, a single pseudo instruction (e.g., the word “MULTIPLY”) may be
all that is necessary to cause this product to be computed and stored.

It is axiomatic that the closer the pseudo-code resembles ordinary language, 
the easier it is to learn.³ Likewise, the smaller the number of pseudo instructions
that have to be used, the easier it will be to code a problem. 

Following is a description of an automatic programming system. Six different 
magnetic tape reels are mounted on six magnetic tape handling devices
attached to the computer.⁴ One reel contains the pseudo-code statement of the
method of solution for the problem (including some computer code if the prob- 
lem or system requires it); the second has a master routine which controls the 
interpretation of the pseudo-code and the assembly of subroutines; and the 
third has the library which contains the complete set of error-free subroutines. 
Of these three reels, the user prepares only the first. The master and library 
routines have been previously coded, presumably by the group of professional 
programmers who developed the automatic system. 

The master routine interprets the pseudo instruction and withdraws the 
needed subroutines from the library in the desired sequence. It causes a running 
program in computer code to be assembled and written onto a fourth reel.⁵ The
fifth reel contains the input data (i.e., the data supplied by the user and upon 
which the computations and processing will be performed). The running pro- 
gram operates upon these input data to produce the output data, which are in 
turn written onto a sixth reel. These output data may then be reproduced 
visually on the desired output medium (e.g., Typewriter, High Speed Printer)
according to the coding. 

There are three basic types of automatic techniques: the Interpretive,
Assembly and Compiling Systems. Following are “typical” descriptions of each.


2 Multiplication requires a minimum number of computer instructions only if the number of significant digits
and the position of the decimal point in the multiplicand and multiplier are the same for each multiplication every
time it is performed. Many statistical and scientific calculations do not meet these conditions. A large number of
the most significant digits must be retained and the locations of the decimal points in the operands remembered.
This makes it necessary to shift and scale quantities to position the most significant digits and use an additional set
of instructions to locate the decimal point. Under these conditions a maximum number of instructions (about 50)
must be used for the complete process. However, some equipments have the “floating point” feature by which all
decimal points are automatically positioned in arithmetic computations. If these equipments are used, this
additional extensive programming is not necessary.

3 The problem of coding a computer may be compared to the use of a language. There are several interesting
illustrations of similarities. For example, combinations of binary digits constitute the alphabet; the instructions
which perform transitive actions, such as arithmetic operations or the transfer of data, constitute the verbs; the
instructions which transfer program control based on predefined conditions are the conjunctions; and the data upon
which the instructions operate can be considered as the nouns and adjectives. Since the ultimate purpose of all
language is communication of ideas and processes, coding may be compared to the communication to the computer
in its own language of a proposed method of operation on data. Likewise, the automatic code is a meta-language
intermediate between the language of the user and that of the computer. Further discussion of this point is beyond
the scope of this paper.

4 This general description uses as a model a type of automatic programming system called the “compiler” as it 
might operate on a computer with a multiple number of magnetic tape units. It was used for this illustration because 
compilers, being the most advanced type of automatic programming, contain the basic elements present in all 
other types. Automatic programming systems will differ from this description because of the type of automatic 
programming, the alternative choices of input and output equipments available, and the nature and size of the 
computer’s memory. 

5 Automatic programming systems differ from one another chiefly with respect to the manner in which the 
running program is assembled. This point is elaborated upon further in the discussion. 
In an interpretive or “running translation” system the running program is
not assembled before the solution is initiated. Each pseudo instruction is inter- 
preted and executed before the next pseudo instruction is interpreted. Thus the 
library subroutines are referred to as the running program operates upon the 
input data. 

Since each subroutine may be used several times for each item of input data, 
there may be a large number of calls to the library. This will tend to cause ex- 
cessive movement of the magnetic tapes, particularly the library tape, in search 
for the subroutines. Thus the over-all computation time is increased, while the 
computation facilities are idle. It has been pointed out that interpretive sys- 
tems are most feasible in computers that have only a small amount of general 
storage.⁶

Assembly and Compiling Systems both obey the “pre-translation”⁷ principle.
Pseudo instructions are interpreted and a running program is produced before 
the solution is initiated. Usually this makes possible a single set of references to 
the library rather than many repeated references. 
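The difference between “running translation” and “pre-translation” can be made concrete by counting references to the library. The following sketch is a loose analogy only: the `Library` class, its two routines, and the one-argument pseudo instructions are hypothetical and stand in for subroutines fetched from a library tape.

```python
class Library:
    """A toy subroutine library that counts how often it is consulted."""

    def __init__(self):
        self.lookups = 0
        self._routines = {
            "DOUBLE": lambda x: 2 * x,
            "INCREMENT": lambda x: x + 1,
        }

    def fetch(self, name):
        self.lookups += 1
        return self._routines[name]


def run_interpretive(program, library, items):
    """Interpret and execute each pseudo instruction for each data item,
    going back to the library every time."""
    results = []
    for x in items:
        for name in program:
            x = library.fetch(name)(x)
        results.append(x)
    return results


def pretranslate(program, library):
    """Fetch every subroutine once and assemble a running program."""
    routines = [library.fetch(name) for name in program]

    def running_program(x):
        for routine in routines:
            x = routine(x)
        return x

    return running_program


if __name__ == "__main__":
    program, items = ["DOUBLE", "INCREMENT"], [1, 2, 3]
    lib_a, lib_b = Library(), Library()
    print(run_interpretive(program, lib_a, items), lib_a.lookups)
    compiled = pretranslate(program, lib_b)
    print([compiled(x) for x in items], lib_b.lookups)
```

Both approaches give the same results; the interpretive run consults the library once per instruction per data item, while the pre-translated program consults it once per instruction.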

In an assembly system the pseudo-code is ordinarily modified computer code. 
Each pseudo instruction refers to one machine instruction or to a relatively 
short subroutine. Under the control of the master routine, the assembly system 
sets up all controls for monitoring the flow of input and output data and in- 
structions. 

A compiler system operates in the same way as an assembly system, but does 
much more. In most compilers each pseudo instruction refers to a subroutine 
consisting of from a few to several hundred machine instructions.⁸ Thus it is
frequently possible to perform all coding in pseudo-code only, without the use 
of any machine instructions. 

From the viewpoint of the user, compilers are the more desirable type of 
automatic programming because of the comparative ease of coding with them. 
However, compilers are not available with all existing equipments. In order to 
develop a compiler, it is usually necessary to have a computer with a large
supplementary storage such as a magnetic tape system or a large magnetic drum.⁹
This storage facilitates compilation¹⁰ by making possible as large a running
program as the problem requires.

Examples of assembly systems are Symbolic Optimum Assembly Programming
(S.O.A.P.) for the IBM 650 and REgional COding (RECO) for the
UNIVAC SCIENTIFIC 1103 Computer. The X-1 Assembly System for the
UNIVAC I and II Computers is not only an assembly system, but is also used
as an internal part of at least two compiling systems.¹¹


6 James H. Brown and John W. Carr III, “Automatic programming and its development on the MIDAC,”
John W. Carr and Norman R. Scott (ed.), Notes on Digital Computers and Data Processors, University of Michigan
1956 Summer Session, I1.4.1., p. 2.

7 The terminology of “running translation” representing interpretive systems and “pre-translation” representing
assembly and compiler systems is adapted from Brown and Carr in the article quoted in the previous footnote.

8 For a description of a typical compiler refer to that portion of the text mentioned in footnote 5.

9 With fine programming, compilers have been developed for a computer with a magnetic drum of 2,000-word
capacity and no other internal storage. Some analysts feel that a larger volume of storage is necessary to obtain
the full flexibility of a compiler.

10 The process of assembling a complete running program is called “compilation” (hence the name compiler).
The amount of computer operating time required to compile the running program is called “compiling time.”

11 Foley, Stanley and Mitchell, Grace E., Symbolic Optimum Assembly Programming. New York: International
Some highly advanced or “intelligent” compilers have been developed which
will make computers more usable by non-programming users such as statisti- 
cians. This can be attributed to the trend to develop problem-oriented pseudo- 
codes. 

For scientific and mathematical calculations, three compilers which translate
formulas from standard symbologies of algebra to computer code are
available for use with three different computers. These are the MATH-MATIC
(AT-3) System for the UNIVAC I and II Computers, FORTRAN (for FORmula
TRANslation) as used for the IBM 704 and 709, and the UNICODE
Automatic Coding System for the UNIVAC SCIENTIFIC 1103A
Computer.¹²

Two advanced compilers have also been developed for use with business data 
processing. These are the FLOW-MATIC (B-ZERO) Compiler for the 
UNIVAC I and II Computers and REPORT GENERATOR for the new 
IBM 709.¹³ In these compilers, English words and sentences are used as
pseudo-code.

General purpose automatic programming systems are built for a wide range 
of usage by many types of users. For this reason they tend to suffer from poor 
adaptability to many specific types of problems, the need to employ a rela- 
tively large number of computer instructions for each subroutine, and ineffi- 
cient usage of internal storage. The result might be an inefficient running pro- 
gram requiring an excessively long time to perform calculations or data-proc- 
essing. 

One of the methods of solving this problem is the construction of special- 
purpose automatic programming which can be best applied to particular classes 
of problems. An example of one of these systems that should be of interest to 
many statisticians is the Matrix-Math Compiler developed by the Franklin 
Institute in Philadelphia. This compiler was constructed especially for
performing matrix algebra computations on the UNIVAC I Computer. In this
system there are pseudo instructions for operations such as Matrix Inversion, 
Matrix or Scalar Multiplication, Matrix Multiplication and Addition to the 
Identity Matrix and Transposition. 


Business Machines Corporation, 1956. Remington Rand Univac, “Regional Coding Routine II (RECO II)— 
Routine RR126” Univac Scientific Central Exchange Newsletter No. 9P X-71900-9, St. Paul: 9 April 1956. Automatic 
Programming Development Group, X-1 Assembly System, New York: Remington Rand Univac, Division of Sperry 
Rand Corporation, 1956. 

12 UNIVAC Math-Matic Programming System, Remington Rand Univac Division of Sperry Rand Corpora- 
tion, 1958. Applied Science Division and Programming Research Department, International Business Machines 
Corporation, The Fortran Automatic Coding System for the IBM 704 EDPM, Programmers’ Reference Manual 
(New York: International Business Machines, 4 October 19%). UNICODE Automatic Coding for Univac Scientific 
Data Automation System 1103A, Remington Rand Univac Division of Sperry Rand Corporation, 1958. 

13 Flow-Matic Programming System, Remington Rand Univac Division of Sperry Rand Corporation, 1958. It is
worth noting that the same compiler is used in this case on two different computers with substantially different
command structures. This illustrates the fact that the language of a compiler can be independent of that of the
computer.

14 Some automatic programming systems are able to construct fairly efficient running programs. This is
particularly true of assembly systems developed for use on computers which employ magnetic drums or other delayed
access devices for operating memories. By proper positioning of the instructions of the running program in the
memory, the amount of time needed to access these instructions can be optimized (minimized). One manufacturer
claims that a compiler developed for use on one of its large computers can produce programs “nearly as efficient as
those written by good programmers.”

15 William McKay, The Matrix Math Compiler for Univac I, The Franklin Institute Laboratories for Research
and Development, OPM 2499, November 15, 1957.
Automatic programming requires as much precision in preparing pseudo-code
as does non-automatic programming in equipment code. A misplaced
symbol or character can obviate a solution. With some compilers this risk is 
minimized with built-in schemes which detect apparently obvious errors, such 
as omission of data or errors in data identification. 

Automatic programming also introduces some difficulty in debugging be- 
cause debugging must be performed on the computer with some information 
in equipment code. Conceptual and developmental work is underway to have 
debugging performed in pseudo-code.¹⁶


6. ILLUSTRATIVE PROBLEM 


The purpose of the example below is to show how automatic programming 
can simplify the process of coding a statistical application for a computer. The 
MATH-MATIC compiler system is used to develop a program which will 
compute the Snedecor F statistic of the one-way Analysis of Variance.¹⁷ Chart
749 gives the pseudo-code program for this computation. 

This program is based on formula (1) which is fairly well-known. 


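$$F = \frac{K(N-1)}{K-1}\cdot\frac{\displaystyle\sum_{j=1}^{K}\Bigl(\sum_{i=1}^{N} X_{ij}\Bigr)^{2} - \frac{1}{K}\Bigl(\sum_{j=1}^{K}\sum_{i=1}^{N} X_{ij}\Bigr)^{2}}{\displaystyle N\sum_{j=1}^{K}\sum_{i=1}^{N} X_{ij}^{2} - \sum_{j=1}^{K}\Bigl(\sum_{i=1}^{N} X_{ij}\Bigr)^{2}} \tag{1}$$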


The pseudo-code is entered on the appropriate tape reel. The process of 
compilation results in a running program which will process the input data, 
compute F, and display it on another tape reel. The operator types in the 
values of N and K at the control console before the processing begins. 

The sequence of operations during the processing of the data is a successive 
series of accumulations to sums of values and sums of squares. These accumula- 
tions as defined in the program are: 


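$$S_1 = \sum_{j=1}^{K}\sum_{i=1}^{N} X_{ij}^{2}, \qquad S_2 = \sum_{j=1}^{K}\sum_{i=1}^{N} X_{ij}, \qquad S_3 = \sum_{j=1}^{K}\Bigl(\sum_{i=1}^{N} X_{ij}\Bigr)^{2}$$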


Using these definitions the formula for F reduces to (2):

$$F = \frac{K(N-1)}{K-1}\cdot\frac{S_3 - S_2^{2}/K}{N\,S_1 - S_3} \tag{2}$$

The pseudo-code program of Chart 749 is developed in such a manner that
as the input data are processed, their values and the squares of their values are
added to the S sums in the desired sequence. When all accumulations are complete,
F is computed by formula (2). The pseudo-code program of Chart 749 is
relatively easy to follow with a small amount of additional explanation, which is
given below.

16 Charles Katz, “Systems of Debugging Automatic Coding,” Automatic Coding, Journal of the Franklin
Institute Monograph No. 3, Proceedings of a Symposium held Jan. 24 and 25, 1957, at the Franklin Institute of
Philadelphia, April 1957.

17 This method is frequently used in analysis of the results of experiments for which N random values are
observed in each of K different groups. By reducing the data to an F statistic and computing its value to use in an “F”
probability distribution, a judgement is made as to whether some cause or pure chance alone might have accounted
for the variations between grouped values. See Snedecor’s Statistical Methods for a number of illustrations of the uses
of this statistic.


CHART 749
PSEUDO-CODE PROGRAM FOR COMPUTATION OF F STATISTIC IN
ONE-WAY ANALYSIS OF VARIANCE USING “MATH-MATIC” (AT-3)
COMPILER
(1) TYPE-IN N, K.
(2) S1 = 0.
(3) S2 = 0.
(4) S3 = 0.
(5) VARY J 1 (1) K SENTENCES 6 THRU 12.
(6) S0 = 0.
(7) VARY I 1 (1) N SENTENCES 8 THRU 10.
(8) READ X, IF SENTINEL, JUMP TO SENTENCE 13.
(9) S0 = X + S0.
(10) S1 = X² + S1.
(11) S2 = S0 + S2.
(12) S3 = S0² + S3.
(13) F = (K*(N-1)/(K-1))*(S3-S2²/K)/(N*S1-S3).
(14) PRINT-OUT F, K, N.
(15) STOP.


The data to be reduced to an F statistic (the $X_{ij}$’s) are entered onto the input
data tape reel by group. The end of the data is indicated by twelve Z’s
(ZZZZZZZZZZZZ) which is called the sentinel.¹⁸

The first seven statements or “sentences” in the program set up the condition
for processing each unit of input data. Sentence (1) enables the operator to type
in the values of N and K. Sentences (2), (3), and (4) ensure that the accumulations
of S1, S2, and S3 all start at zero at the outset.

Sentence (5) sets up a “loop” or major subroutine which is repeated for each
of the K groups starting with the first group and advancing in stages of one
group at a time. Sentence (6) clears S0 to zero at the beginning of each of these
K loops. Sentence (7) sets up a loop for reading each of the N observations
within each group.

Sentence (8) is the beginning of the actual data reduction. It enables the
reading (from the input data tape reel) of the next value and testing to determine
whether it is the sentinel. If it is not the sentinel, accumulations are made
to S0 and S1 by sentences (9) and (10). When all of the N values for a group have
been entered and reduced, accumulations are made for S2 and S3 by sentences
(11) and (12). This process is repeated for each of the K groups.

18 Sentinels are needed in computer operation to indicate end of data, end of file and of tape reel, or any other
point of differentiation between one type of data and another. Twelve Z’s are often employed for this purpose
because it is unlikely that anyone would use them for data or any other purpose in programming.

If the item read is the twelve-Z sentinel, this indicates the end of the data. The next
step is to compute F. This is all performed under the control of sentence (13).
This is a remarkable instruction because it is simply a statement of formula
(2).¹⁹ It is worth noting that on the basis of this single instruction three
multiplications, four subtractions, and three divisions are all performed. In addition,
all significant digits are properly positioned and all decimal points evaluated
with no additional coding.

Sentence (14) instructs the control typewriter to print out the values of the 
statistic F, and the constants K and N. Sentence (15) is the instruction that 
stops all calculations and rewinds the magnetic tapes. 
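
For readers more familiar with present-day languages, the sentences of Chart 749 can be sketched in Python. This is only an illustrative translation, not the MATH-MATIC system itself: the function name `f_statistic` and the list-of-lists input are assumptions, the sentinel test of sentence (8) is replaced by ordinary iteration, and sentence (13) is taken to be formula (2) with S3 − S2²/K in the numerator.

```python
def f_statistic(groups):
    """Snedecor F for a one-way layout: K groups, N observations in each."""
    k = len(groups)
    n = len(groups[0]) if k else 0
    if k < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need K >= 2 groups with the same N >= 2 observations each")

    s1 = s2 = s3 = 0.0            # sentences (2)-(4): clear the accumulators
    for group in groups:          # sentence (5): one pass per group
        s0 = 0.0                  # sentence (6): sum for the current group
        for x in group:           # sentences (7)-(8): read each observation
            s0 += x               # sentence (9)
            s1 += x * x           # sentence (10)
        s2 += s0                  # sentence (11)
        s3 += s0 * s0             # sentence (12)

    # Sentence (13), i.e. formula (2). The denominator is zero when every
    # group is internally constant, in which case F is undefined.
    return (k * (n - 1) / (k - 1)) * (s3 - s2 * s2 / k) / (n * s1 - s3)


if __name__ == "__main__":
    # Three made-up groups of three observations each (not data from the paper).
    print(f_statistic([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

For the three made-up groups in the usage lines, the between-group sum of squares is 26 on 2 degrees of freedom and the within-group sum of squares is 6 on 6 degrees of freedom, so the call should return F = 13.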

This problem required about four working days of preparation. This included
learning to use the compiler, planning input data layout, and flow
charting. Only two hours were required to prepare the pseudo-code. The 15
sentences of pseudo-code correspond to a running program of approximately
1,000 computer code instructions.²⁰

About four minutes of compiling time was required (i.e., for the computer to 
prepare the running program). The running time for the computation itself 
was negligible. The program ran without errors the first time it was tried. Thus, 
no debugging time was required. 

For a set of test data with 30 observations and K = 3 and N = 10, the running
time was about three seconds. Extrapolation indicates that it would take a
very large number of observations to require a substantial running time, such
as a minute.


3. COMPARATIVE COSTS 


The cost of computing a statistic or set of statistics depends upon the amount 
of data reduced. The purpose of this section is to compare the costs of comput- 
ing by means of automatic programming with those of alternative methods. 
Hypothetical costs are estimated for computing the Snedecor F three different 
ways, as follows: 

Method I: Desk Calculator

Method II: Computer with Hand Coding (i.e., without Automatic Programming)²¹

Method III: Computer with Automatic Programming Employing a Compiler.

For simplicity, linear cost functions will be assumed. The results of the
illustrative problem of the preceding section will be used.²²

Under Method I the only cost incurred is clerical assistance to operate the
calculator. It is assumed that an operator being paid $2.00 per hour can process
the data at an average rate of fifty observations an hour.

19 The asterisk in the formula is used as a multiplication sign.

20 In a similar application using the A-2 compiler system on the UNIVAC computer, Grace M. Hopper reported
that a multiple-correlation problem with one dependent and four independent variables required 198 A-2 pseudo-code
operations, 8 minutes of compiling time, and produced approximately 2300 instructions. (From John W. Carr
and Norman R. Scott, op. cit., II 4.4, p. 2.)

21 For most computers today it is unlikely that it would be necessary to code statistical problems completely
without the aid of some type of automatic programming. Most computers have interpretive and assembly systems
or other programming aids, such as a library of subroutines.

22 Since these are largely hypothetical data applied to one type of problem as it might be performed on a
particular computer, no set of assumptions will be completely satisfactory. However, the method of estimating costs
is more important in this illustration than the actual figures themselves.

Under Method II two costs are considered: professional programming services
and computer time. Programming service charges are estimated at $3.50
per hour, and computer time at $300.00 per hour, or $5.00 per minute.

Under Method III it is assumed that the user programs in pseudo-code at an
“opportunity cost” (i.e., value of his services in alternative uses for the same
time) of $5.00 per hour. Computer operating costs consist of charges for compiling
time and program running time.²³ Since both are essentially computer
operating time, both cost $5.00 per minute.

Under Method I no fixed charges are incurred. The operator makes all of the 
calculations at the rate of fifty observations per hour. With an hourly charge of 
$2.00 the “per observation” cost is $0.04. 

Under Method II programming services rendered are independent of the 
amount of data reduced. It is estimated that it would require two working weeks 
of 80 hours of programming, coding, and debugging to prepare the 1,000 in- 
structions of the running program. At $3.50 per hour this represents a fixed 
cost of $280.00. 

Computer time requirements estimated under Method II are based on the
assumption that the computer can process and perform all of the computations
for 1,000 observations a minute.²⁵ At $5.00 per minute rental this represents an
incremental cost of $.005 per observation, or one-eighth of the desk calculator's
$0.04 per observation.

Under Method III two hours were used to prepare the pseudo-code solution.
This represents a fixed cost of $10.00 at $5.00 per hour. Compiling time on the
computer was four minutes, representing a fixed cost of $20.00. Thus, total
fixed costs are $30.00. The running time for the input data is estimated to be
the same as for Method II, so that the incremental cost per observation is assumed
to be $.005.²⁶

Tables 752a and 752b below summarize these figures. Table 752a shows the 
amount of personnel or computer time required under each of the three meth- 
ods by category. Table 752b reduces these time estimates to dollar costs. The 
cost estimates for Methods II and III are based on a single usage of the pro- 
gram. However, a program once prepared can always be re-used. 

TABLE 752a. TIME ESTIMATES FOR COMPUTATION OF “F” STATISTIC

                               Method II                      Method III
              Method I       Hand coding of             Automatic programming
              on desk      automatic computer           of automatic computer
Number      calculator(a)
in sample     (hours)      Coding    Running(b)     Coding   Compiling   Running(b)
                            time       time          time      time        time
                           (weeks)   (minutes)     (hours)   (minutes)   (minutes)

    100           2           2         0.1            2         4          0.1
    500          10           2         0.5            2         4          0.5
  1,000          20           2         1.0            2         4          1.0
  2,000          40           2         2.0            2         4          2.0
  5,000         100           2         5.0            2         4          5.0
 10,000         200           2        10.0            2         4         10.0

(a) Average rate of fifty observations per hour.
(b) 1,000 observations processed per minute (or approximately 17 per second) on a computer with magnetic tape
input.


TABLE 752b. COSTS FOR COMPUTATION OF “F” STATISTIC(a)

                                  Method II                          Method III
Number       Method
in sample     I(b)      Coding(c)  Running(e)   Total    Coding(d)  Compiling(e)  Running(e)   Total

    100      $  4.00     $280.00     $  .50    $280.50    $10.00      $20.00       $  .50     $30.50
    500        20.00      280.00       2.50     282.50     10.00       20.00         2.50      32.50
  1,000        40.00      280.00       5.00     285.00     10.00       20.00         5.00      35.00
  2,000        80.00      280.00      10.00     290.00     10.00       20.00        10.00      40.00
  5,000       200.00      280.00      25.00     305.00     10.00       20.00        25.00      55.00
 10,000       400.00      280.00      50.00     330.00     10.00       20.00        50.00      80.00

(a) Based on Table 752a.
(b) Assumed cost of $2.00 per hour for clerical labor.
(c) Assumed cost of $3.50 per hour for programming assistance.
(d) “Opportunity cost” of statistician’s time at $5.00 per hour.
(e) Based on large computer rental cost of $5.00 per minute.


When a hand-coded program (Method II) is used a second time, the only
costs incurred are the incremental (per observation) costs. The fixed cost of
$280.00 for programming is eliminated. However, when automatic programming
is used (Method III), compiling time may still be necessary, even though
the pseudo-code itself does not have to be prepared again. Thus the $20.00
charge for compiling time must be incurred the second time the program is
used.²⁷

Both of these situations are depicted in Chart 753 as Methods IIa and IIIa,
respectively.

If dollar costs alone are considered, for fewer than 857 observations and
“single” use programs, the desk calculator is the cheapest method. For more
than 857 observations the automatically programmed computer is most economical.
Thus 857 observations in this illustration may be called the “breakeven”
point for automatic programming as compared to desk calculation.

[Chart 753. Cost plotted against number of observations (horizontal axis marked from 1,000 to 8,000), including Methods IIa and IIIa.]

Hand coding is more expensive than automatic programming and is not
economical to use if the program is to be used only once. However, it may be
the more efficient method if the program is to be used often enough.

23 Debugging time on the computer should also be considered. However, it was omitted from these calculations
for simplicity.

24 This is probably not a constant per unit (i.e., linear) cost function. In actual fact, both the incremental and
average per unit cost should decrease as the number of observations increases to reflect the influence of learning
repetitive operations. However, this should not materially affect the conclusions developed.

25 The UNIVAC I Computer can process data at a maximum rate in excess of 20,000 observations per minute
when the input data are entered in the format referred to as “two word floating decimal” which is employed in
certain compilers. The rate of processing may be slower than this maximum when there is a large volume of
computation and input data are prepared on a magnetic tape with low initial density of recording. It would appear that an
average processing rate of 1,000 observations per minute or less than 1/20th of this maximum is a conservative
estimate for our purposes.

26 As was discussed above, there is reason to believe that an automatically-developed program might operate
at a somewhat smaller efficiency than a corresponding hand-coded program. Thus incremental cost per unit
of data processed might be higher for the automatic program.

27 Although it cannot be done in this case because of the nature of the MATH-MATIC compiler system, it is
theoretically possible to develop a compiler for which a pseudo-code can be prepared in such a way that a repetition
of compiling time is unnecessary the second time the program is used.

Thus, even though there are certain fixed costs when a problem is performed 
by a computer, every problem has a “breakeven point,” a level of activity 
above which the use of a computer is justified. Since automatic programming 
reduces these fixed costs drastically, the computer may be used at a lower level 
of activity than is feasible with other methods of programming. 
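
The breakeven arithmetic of this section can be checked with a few lines of Python. The sketch below simply restates the linear cost assumptions given in the text ($0.04 per observation by desk calculator; $280.00 or $30.00 of fixed cost plus $.005 per observation on the computer); the function names and the `reuse` flag for Methods IIa and IIIa are illustrative assumptions, not part of the original study.

```python
def cost_desk(n):
    """Method I: $2.00 per hour at fifty observations per hour."""
    return 0.04 * n


def cost_hand_coded(n, reuse=False):
    """Method II (IIa when reuse=True): the $280.00 programming cost is paid once."""
    fixed = 0.0 if reuse else 280.00
    return fixed + 0.005 * n


def cost_compiler(n, reuse=False):
    """Method III (IIIa when reuse=True): compiling is paid every time, coding once."""
    fixed = 20.00 if reuse else 30.00
    return fixed + 0.005 * n


def breakeven(fixed_a, rate_a, fixed_b, rate_b):
    """Number of observations at which two linear cost lines cross, or None if parallel."""
    if rate_a == rate_b:
        return None
    return (fixed_b - fixed_a) / (rate_a - rate_b)


if __name__ == "__main__":
    print(breakeven(0.0, 0.04, 30.00, 0.005))    # desk calculator vs. Method III
    print(breakeven(0.0, 0.04, 280.00, 0.005))   # desk calculator vs. Method II
    print(breakeven(280.00, 0.005, 30.00, 0.005))  # Methods II and III never cross
```

Under these assumptions the first crossing falls at about 857 observations, as stated in the text, and the desk calculator would be overtaken by single-use hand coding only near 8,000 observations. Because Methods II and III share the same incremental cost, their lines are parallel and only the fixed costs separate them.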


4. ANALYSIS AND SUMMARY 


Automatic programming is clearly a useful tool for statisticians, enabling
them to have computations and data processing performed on automatic
digital computers. It reduces the psychological barrier to the computer by
simplifying coding. It reduces the elapsed time between conception of the problem
and its ultimate solution. By so doing it also reduces the cost of the programming
effort. By requiring only a small amount of coding, it minimizes the risk
of errors in the program and computation, and cuts debugging.

Compilers require the least amount of pseudo-code for any particular operation
and therefore seem the most attractive to statisticians. However, compilers
are not available for use on all computers. In addition, many general-purpose
compilers tend to be more satisfactory for certain classes of problems than
others.

It is not possible to determine how useful automatic programming has been
to the statistics profession. When more statisticians operate and become
familiar with computers, some empirical evidence of their usage of automatic
programming might become available. Current trends in automatic programming
nevertheless make it possible to offer some suggestions.

Those statisticians who reduce sample and survey data should find the Eng- 
lish language-type of compiler for data-processing useful. Those who perform 
primarily mathematical computations could make best use of the formula 
translation type of compiler. However, if matrix algebra is to be performed, it 
is best to use routines or compilers developed for that specific purpose. 

Assembly and interpretive systems are available with most computers and 
can be used in many situations where compilers are not available. Their useful- 
ness depends upon both the nature of the computer and the work to be per- 
formed. 

The author suspects that most statisticians who have any familiarity with 
computers and automatic programming harbor a desire for a statistical compiler. 
In this type of compiler pseudo-code statements could represent frequently- 
used statistics. This could include the mean, mode, median, and standard 
deviation among many others. Likewise, sampling and inferential statistics and 
measures used in multivariate analysis could be stated in pseudo-code. Since 
many calculations are common to all of these measures, there is no conceptual 
reason why this cannot be performed in one compiler system. 

The future points to the possibility of an automatic programming meta-language
which might be used for all computers and for all applications. Other
anticipated developments include automatic operating and automatic debugging.
Such developments would simplify the writing of a paper like this one.




MATRIX INVERSION, ITS INTEREST AND APPLICATION 
IN ANALYSIS OF DATA* 


B. G. Greenberg and A. E. Sarhan
University of North Carolina 

Matrix inversion is used in the least squares analysis of data to esti- 
mate parameters and their variances and covariances. When the data 
come from the analysis of variance, analysis of covariance, order sta- 
tistics, or the fitting of response-surfaces, the matrix to be inverted 
usually falls into a structured pattern that simplifies its inversion. 

One class of patterned matrices is characterized by non-singular symmetrical
arrangements in which linear combinations of the first (r − 1)
rows provide the right-hand portion, starting with the elements on the
principal diagonal, of the rth and remaining rows. That is:

$$\lambda_{1i} v_{1j} + \lambda_{2i} v_{2j} + \cdots + \lambda_{r-1,i}\, v_{r-1,j} = v_{ij}$$

for $r \le i \le j$, with $v_{ij} = v_{ji}$ for all $i \ne j$. The inverses of matrices of this class
contain a non-null principal diagonal and, immediately adjacent to the
principal diagonal, (r − 1) non-null superdiagonals and (r − 1) non-null
subdiagonals. All other elements are zero. These inverses are called
diagonal matrices of type r. That is, a matrix is diagonal of type r if
$a_{ij} = 0$ for $|i - j| \ge r$ and $a_{ij} = a_{ji}$ for all i, j. When r = 2, the inverse is
easily written in terms of $\lambda_i$ and $v_{ij}$. A general procedure for obtaining
the inverse when r = 3 is given.

The results for r = 2 are illustrated by a problem in order statistics
using the two-parameter exponential distribution.

Patterned matrices also are amenable to partitioning and this is 
another convenient device to find the exact inverse quickly. In fitting 
a response-surface to data, for example, a complicated-looking matrix 
can be abbreviated and simplified by selective partitioning. When this 
has been done, the exact inverse can be found by equating the product 
of the matrix and its inverse to the elements of the identity matrix. 
This procedure is illustrated with some data from a problem in the fit- 
ting of a response-surface. 

When the matrix has no special pattern, as in the usual regression 
problem, the recommended procedure for matrix inversion is the modi- 
fied square root method. 


1. INTRODUCTION


This paper is concerned with the inversion of a class of matrices with special
patterns as well as the numerical inversion of matrices in general. In a subsequent
paper, the concepts will be extended to the case where a submatrix is
to be inverted after the whole matrix has been inverted. In both papers, the results
will be applied to problems in analysis of data where such matrix inversion
is applicable.

Patterned matrices occur frequently in least squares solutions for estimating
population parameters, in analysis of variance, and in response-surface fitting.
Inverting sections of matrices that have been inverted may be required not
only when least squares estimates are revised after elimination of the nonsignificant
variables but also when estimating parameters from censored samples.
The matrices considered here are always symmetric square matrices.

* Sponsored by the Office of Ordnance Research, U. S. Army. Revision of a paper presented before 31st Session,
International Statistical Institute, in Brussels, Belgium, on September 5, 1958. Travel to Brussels was made
possible by a grant from the American Statistical Association.

The literature describing methods for the inversion of matrices, both in gen- 
eral and special cases, has been condensed in Dwyer [1]. A more recent survey 
of special methods is given in Householder [3]. The present paper provides 
some addition to and a clarification of some of these methods. 


2. DIAGONAL MATRICES OF TYPE r 


Define a matrix as a diagonal matrix of type 1 when it contains non-null
elements along the diagonal and zeros elsewhere. A matrix with non-zero elements
along the main diagonal and one non-zero diagonal of elements immediately
above and below the main diagonal is a diagonal matrix of type 2. Similarly,
an (n × n) matrix is a diagonal matrix of type r when it consists of (r − 1)
superdiagonals and (r − 1) subdiagonals containing non-zero elements around
the main diagonal, where r = 1, 2, ..., n. In general, a diagonal matrix of the
r-th type satisfies the condition $a_{ij} = 0$ where $|i - j| \ge r$.
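
The definition can be made concrete with a short Python sketch. The helper name `diagonal_type` is hypothetical; it returns the smallest r for which every element with |i − j| ≥ r is zero, which is the type in the sense defined above when the outermost band actually contains non-zero elements.

```python
def diagonal_type(a):
    """Smallest r such that a[i][j] == 0 whenever |i - j| >= r."""
    width = 0
    for i, row in enumerate(a):
        for j, value in enumerate(row):
            if value != 0:
                # Remember the farthest non-zero element from the main diagonal.
                width = max(width, abs(i - j))
    return width + 1


if __name__ == "__main__":
    tridiagonal = [
        [2, -1, 0, 0],
        [-1, 2, -1, 0],
        [0, -1, 2, -1],
        [0, 0, -1, 2],
    ]
    print(diagonal_type(tridiagonal))
```

An ordinary diagonal matrix is of type 1 under this test, the tridiagonal matrix in the usage lines is of type 2, and a matrix with no zero elements of order n is of type n.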


3. THEOREM


Ukita [10]¹ proved that a necessary and sufficient condition for the inverse
of a symmetric matrix to be a diagonal matrix of type 2 is that each element
of the original matrix in the j-th row (starting at the main diagonal and proceeding
to the right) must have a constant relation with the corresponding
columnar element in the first row. That is, if there is a symmetric V matrix,
with elements $v_{jk}$, then the necessary and sufficient condition for $V^{-1}$ to be a
diagonal matrix of type 2 is:

$$\frac{v_{jk}}{v_{1k}} = \lambda_j .$$

Thus,

$$v_{jk} = \lambda_j v_{1k},$$

where $\lambda_j$ depends only upon the row, j, where j = 2, 3, ..., n and k is a positive
integer, $j \le k \le n$.

1 We wish to acknowledge and to thank one of the referees for directing our attention to the same result given
by Guttman [2].

In section 3 of Sarhan and Roy [6], the investigators proved that when a
matrix can be expressed in a specific form related to the theorem above, its inverse
would be a diagonal matrix of type 2. This result can now be generalized
and simplified.
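
The theorem can be tried numerically. The following Python sketch is an illustration under stated assumptions rather than the authors' procedure: the exact Gauss-Jordan routine `inverse`, the checker `satisfies_row_ratio_condition`, and the two small test matrices are all hypothetical, and the first row is assumed to contain no zeros so that the ratios $v_{jk}/v_{1k}$ exist.

```python
from fractions import Fraction


def inverse(m):
    """Exact Gauss-Jordan inverse of a non-singular square matrix."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        # Raises StopIteration if the matrix is singular (no usable pivot).
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def satisfies_row_ratio_condition(v):
    """True if v[j][k] / v[0][k] is constant over k >= j within each row j."""
    n = len(v)
    for j in range(1, n):
        ratios = {Fraction(v[j][k]) / Fraction(v[0][k]) for k in range(j, n)}
        if len(ratios) != 1:
            return False
    return True


def is_type_2(a):
    """True if every element with |i - j| >= 2 is zero."""
    n = len(a)
    return all(a[i][j] == 0 for i in range(n) for j in range(n) if abs(i - j) >= 2)


if __name__ == "__main__":
    patterned = [[1, 1, 1], [1, 2, 2], [1, 2, 3]]     # satisfies the condition
    unpatterned = [[2, 1, 1], [1, 2, 1], [1, 1, 2]]   # does not
    for v in (patterned, unpatterned):
        print(satisfies_row_ratio_condition(v), is_type_2(inverse(v)))
```

Exact rational arithmetic is used so that the zeros of the inverse are true zeros rather than rounding residue. For the two illustrative matrices, the condition and the type-2 property of the inverse should agree in both cases.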

Consider a symmetric matrix with $v_{1j}$ as the elements of the first row and
$v_{rj} = \lambda_r v_{1j}$ $(j \ge r)$ as the elements of the r-th row. The inverse is a diagonal
matrix of type 2 with elements as follows:


(r > 2) (3.1) 
and 
1 
= (3.2) 
— 
For r=1, 


= — — (3.3) 


4. EXAMPLES


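Example 1. The symmetric matrix V, given below, is the major part of the
variance matrix of the ordered observations in samples of size n from the rectangular
distribution as given in Lloyd [4]. The inverse of this portion is required
when estimation of population parameters is required. The matrix,
$[v_{ij}]$, can be written for all $i \le j$, and the remainder filled in by symmetry.

$$V = [v_{ij}] = [\,i(n-j+1)\,] =
\begin{bmatrix}
n & (n-1) & (n-2) & \cdots & 1 \\
  & 2(n-1) & 2(n-2) & \cdots & 2 \\
  &        & 3(n-2) & \cdots & 3 \\
  & \text{(symmetric)} & & \ddots & \vdots \\
  &        &        &        & n
\end{bmatrix}$$

From V, it can be seen that

$$\lambda_j = \frac{v_{jk}}{v_{1k}} = j, \qquad \text{where } j \le k.$$

Therefore, the inverse is a diagonal matrix of type 2 with its elements expressed as:

$$v^{ii} = \frac{2}{n+1}, \qquad v^{i,i+1} = v^{i+1,i} = -\frac{1}{n+1}.$$

Therefore, $V^{-1}$ is expressed as

$$V^{-1} = \frac{1}{n+1}
\begin{bmatrix}
2 & -1 & 0 & \cdots & 0 \\
-1 & 2 & -1 & \cdots & 0 \\
0 & -1 & 2 & \cdots & 0 \\
\vdots & & & \ddots & -1 \\
0 & 0 & \cdots & -1 & 2
\end{bmatrix}$$

Example 2. If $v_{kj} = v_{kj'}$ for $(k < j, j')$, and the matrix is symmetric, then, in
addition to the fact that $v_{rj} = \lambda_r v_{1j}$ for $r \le j$, the following also hold: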


Vr 

= 
Vi,r-1 V1,r-1 
Vr-1,r 
= 


Vir 


= 


This means that (3.1) and (3.2) can be reduced as follows:


and 


Ur = 


-1 


| 


1 


A1,r) 


for r > 2. 


The symmetric matrix V, given below, is the variance matrix of the ordered 
observations in samples of size n from an exponential distribution as given in 
Sarhan [7]. The inverse is required when estimation of population parameters 
is desired. 


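$$V = [v_{ij}], \qquad v_{ij} = \sum_{k=1}^{i} \frac{1}{(n-k+1)^{2}} \quad (i \le j),$$

that is,

$$V =
\begin{bmatrix}
\dfrac{1}{n^{2}} & \dfrac{1}{n^{2}} & \cdots & \dfrac{1}{n^{2}} \\
 & \dfrac{1}{n^{2}} + \dfrac{1}{(n-1)^{2}} & \cdots & \dfrac{1}{n^{2}} + \dfrac{1}{(n-1)^{2}} \\
 & \text{(symmetric)} & \ddots & \vdots \\
 & & & \displaystyle\sum_{i=1}^{n} \frac{1}{(n-i+1)^{2}}
\end{bmatrix}$$

In this case,

$$\lambda_i = \frac{v_{ij}}{v_{1j}} = n^{2} \sum_{k=1}^{i} \frac{1}{(n-k+1)^{2}}$$

for a fixed i $(i \le j = 1, 2, \cdots, n)$. Therefore,

$$v^{rr} = (n-r+1)^{2} + (n-r)^{2}$$

and

$$v^{r-1,r} = -(n-r+1)^{2}.$$

Therefore,

$$V^{-1} =
\begin{bmatrix}
n^{2}+(n-1)^{2} & -(n-1)^{2} & 0 & 0 & \cdots & 0 \\
 & (n-1)^{2}+(n-2)^{2} & -(n-2)^{2} & 0 & \cdots & 0 \\
 & & (n-2)^{2}+(n-3)^{2} & -(n-3)^{2} & \cdots & 0 \\
 & \text{(symmetric)} & & (n-3)^{2}+(n-4)^{2} & \cdots & 0 \\
 & & & & \ddots & \vdots \\
 & & & & & 1
\end{bmatrix}$$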
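
The type 2 inverse stated for Example 2 can be confirmed by multiplying it into V, in the spirit of equating the product to the identity matrix. The Python sketch below assumes the variance matrix has elements $v_{ij} = \sum_{k=1}^{\min(i,j)} (n-k+1)^{-2}$ (the standard result for ordered exponential observations with unit scale) and that the inverse has diagonal elements $(n-r+1)^2 + (n-r)^2$ and off-diagonal elements $-(n-r)^2$ between rows r and r + 1. The function names are illustrative.

```python
from fractions import Fraction


def exponential_variance_matrix(n):
    """v[i][j] = sum over k = 1..min(i, j) of 1 / (n - k + 1)^2, with 1-based i, j."""
    partial = [Fraction(0)]
    for k in range(1, n + 1):
        partial.append(partial[-1] + Fraction(1, (n - k + 1) ** 2))
    return [[partial[min(i, j)] for j in range(1, n + 1)] for i in range(1, n + 1)]


def claimed_inverse(n):
    """Tridiagonal matrix built from the closed-form elements given in the text."""
    inv = [[Fraction(0)] * n for _ in range(n)]
    for r in range(1, n + 1):
        inv[r - 1][r - 1] = Fraction((n - r + 1) ** 2 + (n - r) ** 2)
        if r < n:
            inv[r - 1][r] = inv[r][r - 1] = Fraction(-((n - r) ** 2))
    return inv


def product_is_identity(a, b):
    """True if the matrix product a * b equals the identity exactly."""
    n = len(a)
    for i in range(n):
        for j in range(n):
            entry = sum(a[i][k] * b[k][j] for k in range(n))
            if entry != (1 if i == j else 0):
                return False
    return True


if __name__ == "__main__":
    for n in range(2, 9):
        print(n, product_is_identity(exponential_variance_matrix(n), claimed_inverse(n)))
```

For n = 2 the matrices are small enough to check by hand: V has rows (1/4, 1/4) and (1/4, 5/4), and the claimed inverse has rows (5, −1) and (−1, 1).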


To demonstrate the usefulness of having this inverse, consider the two-parameter
exponential distribution

$$f(y) = \frac{1}{\sigma}\, e^{-(y-\alpha)/\sigma}$$

where

$$\alpha \le y < \infty$$

and

α = measure of location (smallest theoretical observation)
σ = measure of dispersion
μ = α + σ = mean.

As given in Sarhan and Greenberg [8], by using least squares and the inverse
of the variance-covariance matrix given here, a sample of size n will have as its
best estimates:

$$\alpha^{*} = \frac{n\,y_{(1)} - \bar{y}}{n-1}, \qquad
\sigma^{*} = \frac{n\,(\bar{y} - y_{(1)})}{n-1}, \qquad
\mu^{*} = \alpha^{*} + \sigma^{*} = \bar{y},$$

where $y_{(1)}$ = smallest observed value.

These results may be applied to an experiment in which eight rabbits were
inoculated with Treponema pallidum. The observed periods of incubation were
recorded in days: 11, 12, 13, 14, 11, 38, 14, and 15. After rearranging the incubation
days in size order, the values for α*, σ*, and μ* are obtained as follows:

$$\alpha^{*} = \frac{8(11) - 16}{7} = 10.29 \text{ days}$$

$$\sigma^{*} = \frac{8(16 - 11)}{7} = 5.71 \text{ days}$$

$$\mu^{*} = 16 \text{ days.}$$
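
The rabbit example is easily reproduced. The short Python sketch below assumes the estimators are $\alpha^* = (n\,y_{(1)} - \bar{y})/(n-1)$, $\sigma^* = n(\bar{y} - y_{(1)})/(n-1)$, and $\mu^* = \alpha^* + \sigma^*$, which agree with the numerical work shown for the eight incubation periods. The function name `exponential_estimates` is hypothetical.

```python
from fractions import Fraction


def exponential_estimates(sample):
    """Return (alpha*, sigma*, mu*) for a complete sample of at least two observations."""
    n = len(sample)
    if n < 2:
        raise ValueError("at least two observations are required")
    y_min = min(sample)                   # y_(1), the smallest observed value
    y_bar = Fraction(sum(sample)) / n     # sample mean, kept exact
    alpha = (n * y_min - y_bar) / (n - 1)
    sigma = n * (y_bar - y_min) / (n - 1)
    return alpha, sigma, alpha + sigma


if __name__ == "__main__":
    days = [11, 12, 13, 14, 11, 38, 14, 15]
    alpha, sigma, mu = exponential_estimates(days)
    print(round(float(alpha), 2), round(float(sigma), 2), float(mu))
```

With the eight incubation periods the exact values are 72/7, 40/7, and 16 days, which round to the 10.29, 5.71, and 16 days reported in the text. Note that only the smallest observation and the mean enter the estimates, so sorting the sample is not needed for the computation itself.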


5. TYPE r DIAGONAL INVERSE MATRIX


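A sufficient² condition that a square symmetric matrix will have an inverse of
diagonal type r is that

$$\sum_{i=1}^{r-1} \lambda_{ia}\, v_{ij} = v_{aj}, \qquad \text{where } r \le a \le j. \tag{5.1}$$

In particular, let r = 2 for a diagonal matrix of type 2. Equation (5.1) reduces
to:

$$\left.\begin{aligned}
\lambda_{12} v_{12} &= v_{22} \\
\lambda_{12} v_{13} &= v_{23} \\
&\;\,\vdots \\
\lambda_{12} v_{1n} &= v_{2n}
\end{aligned}\right\} \quad \text{for } a = 2$$

$$\vdots$$

$$\lambda_{1n} v_{1n} = v_{nn} \quad \text{for } a = n.$$

Of course, this is exactly the same condition as that expressed in Section 3
for a diagonal matrix of type 2. For a diagonal matrix of type 3, the relations
in (5.1) reduce to the following:³

$$\left.\begin{aligned}
\lambda_{13} v_{12} + \lambda_{23} v_{22} &= v_{32} \\
\lambda_{13} v_{13} + \lambda_{23} v_{23} &= v_{33} \\
&\;\,\vdots \\
\lambda_{13} v_{1n} + \lambda_{23} v_{2n} &= v_{3n}
\end{aligned}\right\} \quad \text{for } a = 3$$

$$\left.\begin{aligned}
\lambda_{14} v_{13} + \lambda_{24} v_{23} &= v_{43} \\
\lambda_{14} v_{14} + \lambda_{24} v_{24} &= v_{44} \\
&\;\,\vdots \\
\lambda_{14} v_{1n} + \lambda_{24} v_{2n} &= v_{4n}
\end{aligned}\right\} \quad \text{for } a = 4$$

$$\vdots$$

$$\lambda_{1n} v_{1n} + \lambda_{2n} v_{2n} = v_{nn} \quad \text{for } a = n.$$

2 For type 3 diagonal matrices, the condition appears also to be necessary.

3 In all type 3 diagonal matrices examined to date, this relationship appears to hold true in the first equation
although the term $v_{32}$ has i > j.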


6. EXAMPLES OF TYPE 3 DIAGONAL INVERSE 
Given the matrix 


31.4 

7 
A= 4 
— — 2 1 
3 
31 
— 
3! 
5] 


The matrix A is a diagonal matrix of type 3 because of the following rela- 
tionships found when a=3. 


3 3 3 
3 3 3 (6.1) 


2 


22 
O13 — 10 63 


—5 03 + Oe = 1. 


Solving any two of the relationships in (6.1) shows that $\theta_{13} = -3$ and
$\theta_{23} = -2$. These values can then be checked in the remaining equations of (6.1) to
validate the relationship.

For a=4 in the matrix A, there is another check because there are three 
equations and two unknowns: 


+ 4 

= O14 10 (6.2) 
3 3 


$$-5\,\theta_{14} + 7\,\theta_{24} = -7$$

Solution of (6.2) leads to $\theta_{14} = 7/4$ and $\theta_{24} = 1/4$, and these are validated when
substituted in the remaining equation. When a = 5, there is no check upon the
solution for $\theta_{15}$ and $\theta_{25}$ since the two equations involve two unknowns:


=-7 


22 
615 — 10 


—5 05 + 7 Os = 5. 


The inverse of matrix A is a diagonal matrix of type 3 as follows: 


5 3 0 
At= 4 3 
6 
8. 


The general solution for the (n × n) inverse which takes the form of a diagonal
matrix of type 3 is calculated as follows:


1. Calculate: 


Vn—1,n—140—2,n-2 
Unn = — 


where 


Vig Vi,j41 
Ay = = i419 
-1 Vn—2,n—1An—2,n—2 
b. Va-1,n 


An-2,n—2An—2,n—1 


Vij Vi,j42 


942 


and 


ast Va—1,n—14n—2,n—1 
C. Va-2,.n = 


As_3,0—1 An—2,n—2An—1,2-1 


2. Calculate every term of the form $v^{k-2,k}$ by letting k = 3, 4, ..., n in the
same formula 1(c) as above.

3. To calculate $v^{n-1,n-1}$ and $v^{n-2,n-1}$, use them as unknowns, and multiply
any two rows of V by the (n − 1) column of $V^{-1}$. This results in two equations
with two unknowns that can be solved in a fraction of the time that
would be required to calculate directly the value of $v^{n-1,n-1}$ and $v^{n-2,n-1}$ by
appropriate modification of the formulas given in step No. 1.

4. Proceed to the (n − 2) column and use the same technique to derive two
equations for the two unknowns, $v^{n-2,n-2}$ and $v^{n-3,n-2}$.

5. Proceed in a like manner to solve for the two unknowns in each column
until the matrix has been completely inverted.


The procedure of using unknowns in matrix inversion and solving for them 
in this way is not novel but is convenient in the present instance. It is also 
helpful in other situations where a matrix can be partitioned as was done in the 
work cited by Roy and Sarhan. The next section will elaborate on this tech- 
nique in more detail. 


7. GENERALIZATION OF MATRIX PARTITIONING IN INVERSION 


In the cited paper by Roy and Sarhan, a technique of inversion was developed 
for the case where the matrix may be partitioned. In examining this technique 
with matrices generated from problems in the fitting of response-surfaces, cer- 
tain further generalizations appear desirable. 

Consider a complicated-looking matrix that was recently generated in con- 
nection with fitting a response-surface to some data. The matrix was derived 
for a least squares solution by considering the total number of observations, 
the linear, pure quadratic, and mixed terms [5]. 


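$$V = \left[\begin{array}{c|ccc|ccc|ccc}
17 & 22 & 22 & 22 & 3 & 3 & 3 & 6 & 6 & 6 \\ \hline
22 & 46 & 27 & 27 & 8 & 3 & 3 & 6 & 11 & 11 \\
22 & 27 & 46 & 27 & 3 & 8 & 3 & 11 & 6 & 11 \\
22 & 27 & 27 & 46 & 3 & 3 & 8 & 11 & 11 & 6 \\ \hline
3 & 8 & 3 & 3 & 27 & 8 & 8 & 0 & 11 & 11 \\
3 & 3 & 8 & 3 & 8 & 27 & 8 & 11 & 0 & 11 \\
3 & 3 & 3 & 8 & 8 & 8 & 27 & 11 & 11 & 0 \\ \hline
6 & 6 & 11 & 11 & 0 & 11 & 11 & 22 & 3 & 3 \\
6 & 11 & 6 & 11 & 11 & 0 & 11 & 3 & 22 & 3 \\
6 & 11 & 11 & 6 & 11 & 11 & 0 & 3 & 3 & 22
\end{array}\right]$$

By partitioning the matrix in the manner indicated by the dotted lines, it
can be expressed in the following abbreviated form:

$$V = \begin{bmatrix}
17 & 22e' & 3e' & 6e' \\
22e & 19I + 27J & 5I + 3J & -5I + 11J \\
3e & 5I + 3J & 19I + 8J & -11I + 11J \\
6e & -5I + 11J & -11I + 11J & 19I + 3J
\end{bmatrix}$$

where I = the identity matrix, J = a matrix whose elements are all unity, and e
is a column vector whose elements are also unity.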
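The inverse can be expressed in general form as follows:

$$V^{-1} = \begin{bmatrix}
\alpha & \beta e' & \gamma e' & \delta e' \\
\beta e & aI + bJ & cI + dJ & eI + fJ \\
\gamma e & cI + dJ & gI + hJ & kI + lJ \\
\delta e & eI + fJ & kI + lJ & mI + nJ
\end{bmatrix}$$

where the letters α, β, γ, δ, a, b, ..., n are unknowns to be determined for the
inversion.

The values of the Greek letters may be found by multiplying the first row of
the abbreviated $V^{-1}$ by the columns of the abbreviated matrix V. This results
in four equations as follows:

$$\begin{aligned}
17\alpha + 66\beta + 9\gamma + 18\delta &= 1 \\
66\alpha + 300\beta + 42\gamma + 84\delta &= 0 \\
9\alpha + 42\beta + 129\gamma + 66\delta &= 0 \\
18\alpha + 84\beta + 66\gamma + 84\delta &= 0.
\end{aligned}$$

Solution of these equations yields the following values:

$$\alpha = \frac{391}{968}, \qquad \beta = -\frac{87}{968}, \qquad \gamma = -\frac{1}{968}, \qquad \delta = \frac{4}{968}.$$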


Using these values in connection with the results obtained by multiplying
the second row of $V^{-1}$ by the last three columns in V, two sets of equations are
determined, one involving the unknowns associated with the I’s and the
other associated with the unknowns for the J’s.

Thus one obtains the following equations:

$$\begin{aligned}
(19a + 5c - 5e)I + [\,22\beta + (27a + 100b) + (3c + 14d) + (11e + 28f)\,]J &= 1I + 0J \\
(5a + 19c - 11e)I + [\,3\beta + (3a + 14b) + (8c + 43d) + (11e + 22f)\,]J &= 0I + 0J \\
(-5a - 11c + 19e)I + [\,6\beta + (11a + 28b) + (11c + 22d) + (3e + 28f)\,]J &= 0I + 0J.
\end{aligned}$$

Equating the coefficients of the I’s involving a, c, and e to 1, 0, 0, they can
be solved directly as follows:

$$a = \frac{3}{52}, \qquad c = -\frac{1}{104}, \qquad e = \frac{1}{104}.$$

Substituting these values and the one for β into the remaining coefficients for
the J’s, the results for b, d, and f can be obtained as follows:

$$b = \frac{35}{(121)(52)}, \qquad d = \frac{31}{(121)(52)}, \qquad f = -\frac{127}{(121)(104)}.$$

In a similar manner, the coefficients for g and k, h and l may be found from
the following equations:

\[
\begin{aligned}
(5c + 19g - 11k)I + [3\gamma + (3c + 14d) + (8g + 43h) + (11k + 22l)]J &= 1I + 0J \\
(-5c - 11g + 19k)I + [6\gamma + (11c + 28d) + (11g + 22h) + (3k + 28l)]J &= 0I + 0J.
\end{aligned}
\]

Equating the coefficients of the I's and inserting the value of c, the values of
g and k are found to be as follows:

\[
g = \frac{21}{260}, \quad k = \frac{23}{520}.
\]

Inserting the values of $\gamma$, c, d, g, and k into the coefficients of the J's, the
values of h and l are found to be as follows:

\[
h = -\frac{849}{121(520)}, \quad l = -\frac{843}{121(260)}.
\]
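Finally, m and n are obtained by a similar process of multiplying the last
row of $V^{-1}$ by the last column of V and are found to be as follows:

\[
m = \frac{21}{260}, \quad n = \frac{57}{(121)(520)2}.
\]

This information permits the rewriting of $V^{-1}$ as follows:

\[
V^{-1} = \begin{pmatrix}
\frac{391}{968} & -\frac{87}{968}e' & -\frac{1}{968}e' & \frac{4}{968}e' \\[4pt]
-\frac{87}{968}e & \frac{3}{52}I + \frac{35}{(121)(52)}J & -\frac{1}{104}I + \frac{31}{(121)(52)}J & \frac{1}{104}I - \frac{127}{(121)(104)}J \\[4pt]
-\frac{1}{968}e & -\frac{1}{104}I + \frac{31}{(121)(52)}J & \frac{21}{260}I - \frac{849}{121(520)}J & \frac{23}{520}I - \frac{843}{(121)(260)}J \\[4pt]
\frac{4}{968}e & \frac{1}{104}I - \frac{127}{(121)(104)}J & \frac{23}{520}I - \frac{843}{(121)(260)}J & \frac{21}{260}I + \frac{57}{(121)(520)2}J
\end{pmatrix}
\]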
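The patterned inverse can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes each I and J block is of order 3 (inferred from the coefficient 66 = 22 x 3 in the equations for the Greek letters), expands both abbreviated matrices to full 10 x 10 form, and multiplies them in exact rational arithmetic. The helper names `block`, `assemble`, and `matmul` are illustrative only.

```python
from fractions import Fraction as F

P = 3  # order of each I and J block; an assumption inferred from 66 = 22 * 3


def block(i_coef, j_coef):
    """Return the P x P matrix i_coef*I + j_coef*J as nested lists."""
    return [[F(i_coef) * (r == c) + F(j_coef) for c in range(P)] for r in range(P)]


def assemble(corner, border, blocks):
    """Expand an abbreviated 4 x 4 pattern into a full (1 + 3P)-square matrix."""
    rows = [[F(corner)] + [F(b) for b in border for _ in range(P)]]
    for bi, b in enumerate(border):
        expanded = [block(*blocks[bi][bj]) for bj in range(3)]
        for r in range(P):
            rows.append([F(b)] + [x for blk in expanded for x in blk[r]])
    return rows


def matmul(a, b):
    """Plain matrix product of two square matrices of equal order."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)] for i in range(n)]


V = assemble(17, [22, 3, 6], [
    [(19, 27), (5, 3), (-5, 11)],
    [(5, 3), (19, 8), (-11, 11)],
    [(-5, 11), (-11, 11), (19, 3)],
])

b12 = (F(-1, 104), F(31, 121 * 52))
b13 = (F(1, 104), F(-127, 121 * 104))
b23 = (F(23, 520), F(-843, 121 * 260))
V_inv = assemble(F(391, 968), [F(-87, 968), F(-1, 968), F(4, 968)], [
    [(F(3, 52), F(35, 121 * 52)), b12, b13],
    [b12, (F(21, 260), F(-849, 121 * 520)), b23],
    [b13, b23, (F(21, 260), F(57, 121 * 520 * 2))],
])

identity = [[F(int(i == j)) for j in range(len(V))] for i in range(len(V))]
is_inverse = matmul(V, V_inv) == identity
```

Because the arithmetic is exact, `is_inverse` is a strict equality test rather than a tolerance check.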


8. NON-PATTERNED MATRICES 


In least squares estimation, patterned matrices are the rule, not the excep- 
tion, when dealing with the matrices derived from the analysis of variance, re- 
sponse-surface fitting, and in order statistics. When least squares estimation is 
applied to problems of regression in general, however, the occurrence of a pat- 
terned matrix is only a fortuitous circumstance. Inversion of such matrices will 
still require laborious calculations performed by a computer. 

To carry out such computations, every writer on the subject appears to have 
his own preferential method. The present authors are no exception. In connec- 
tion with our research for estimating location and scale parameters by order 
statistics, well over a thousand matrices whose rank ranged from 2 to 20 were 
inverted by hand as well as by electronic computer. To economize labor and 
improve precision, the best methods available had to be utilized. 

A modification of the square-root method, as outlined in a research report by 
Sarhan, Roberts and Greenberg [9], was found to be the best by our criteria. 
In that same report, a new method was proposed for checking calculations
which is exceedingly helpful to the person doing the computing. In this method, 
rounding errors can never cause discrepancies of more than two units in the 
last digit; hence, real mistakes are readily discernible. 


A MULTIPLE COMPARISON SIGN TEST:
TREATMENTS VERSUS CONTROL


ROBERT G. D. STEEL
Mathematics Research Center, U. S. Army
University of Wisconsin


Let $(X_{0j}, X_{1j}, \dots, X_{kj})$ be the result of a single trial, where the
subscript 0 is associated with a control and the subscripts $1, \dots, k$ with
treatments. To test the joint hypothesis $P(X_{ij} - X_{0j} > 0) = 1/2
= P(X_{ij} - X_{0j} < 0)$, all i, compute the test criterion $(r_1, \dots, r_k)$ where
$r_i$ is the number of times $X_{ij} - X_{0j}$ is negative in n trials. A method for
computing the distribution of $(r_1, \dots, r_k)$ is illustrated. Exact probability
distributions of min $r_i$ are given for k=2, n=4(1)10 and k=3, n=4(1)7.
It is conjectured that $2(\min r - n/2)/\sqrt{n}$ is distributed approximately as
Dunnett's t. Tables based on this conjecture are computed and values are seen to
agree well with comparable values from the exact distribution.


1. INTRODUCTION


The analysis of variance is an important tool in the analysis of data. A
significant F is evidence to infer real treatment differences but gives no
information on their location. The need to locate real differences first resulted
in independent comparison procedures of which the ultimate calls for independent
single degree of freedom comparisons. However, in practice, the most meaningful
set of comparisons may not be an independent set. This need gave rise
to a number of multiple comparison procedures for non-independent comparisons.
Among such multiple comparison procedures is that of Dunnett [2] for
comparing several treatments with a control. Dunnett also provides a method
for computing a joint set of confidence intervals.

This paper presents an analogue of Dunnett's test procedure, a multivariate
sign test. The data must consist of (k+1)-tuples, one observation on each of k
treatments and one control, obtained under a variety of conditions, possibly
quite different. The sign test for two treatments, for example a control and one
treatment, is described by Dixon and Mood [1].


2. PROCEDURE


Let $X_{0j}$ and $X_{ij}$, $i = 1, \dots, k$ and $j = 1, \dots, n$, be measured responses on
the control and i-th treatment in the j-th block. The proposed multivariate sign
test requires the number of + signs or - signs in each of the k sets of n signed
differences between control and treatment.

The null and alternate hypotheses are stated in terms of the location of the
medians of the multivariate distributions of the k-tuples of differences
$(d_{1j}, \dots, d_{kj})$, where we define median $(d_{1j}, \dots, d_{kj})$ to equal
(median $d_{1j}$, ..., median $d_{kj}$). A common hypothesis is that the distributions
of the $(d_{1j}, \dots, d_{kj})$ have zero medians.

Let us suppose we wish to test each and every treatment against control for
the purpose of locating treatments that give significantly greater responses than
control. Then for a procedure using signed differences, the null and alternate
hypotheses are:


$H_0$: Each k-tuple of differences $(d_{1j}, \dots, d_{kj})$, $j = 1, \dots, n$,
has a probability distribution with median zero.

$H_1$: The k-tuples of differences have probability distributions with common
median in which one component is greater than zero.


The procedure for testing follows: 

(1) Compute the signed differences $X_{ij} - X_{0j}$, $i = 1, \dots, k$ and $j = 1, \dots, n$.

(2) Observe the number of - signs for each of the k sets of n signs and record
as $r_i$, $i = 1, \dots, k$.

(3) To judge significance, compare each r; with the single tabulated critical 
value for the desired joint probability level. A significance statement is made 
for each of the k comparisons. The appropriate critical region is one tailed. 

Application of the procedure for the purpose of locating treatments that give 
significantly smaller responses than control is obvious. 

In case $H_1$ calls for a component of the common median to be simply different
from zero (response significantly different from control), step 2 becomes:

(2') Observe the number of times the less frequent sign occurs for each of the
k sets of n signs and record as $r_i$, $i = 1, \dots, k$.

Step 3 remains the same. However, the appropriate critical region is two- 
tailed. 

Note that small values of r; are declared significant. In other words, a value 
as small as or smaller than the tabulated value is declared significant. 
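Steps (1) to (3) can be sketched in a few lines of code. This is an illustrative sketch only: the function names are hypothetical, ties are not handled, and the critical value must be supplied by the user from the appropriate table. The numbers are the first six locations of the soybean example in Section 3, and the critical value 0 used at the end is a placeholder for demonstration, not a tabulated entry.

```python
def minus_counts(control, treatments):
    """Step (2): count negative differences X_ij - X_0j for each treatment i."""
    return [sum(1 for x, c in zip(row, control) if x - c < 0) for row in treatments]


def declare_significant(counts, critical_value):
    """Step (3): a count as small as or smaller than the critical value is significant."""
    return [r <= critical_value for r in counts]


# Control and three treatments at six locations (first half of the Section 3 data).
control = [23.8, 25.4, 17.3, 33.5, 34.9, 49.4]
treatments = [
    [29.2, 21.4, 36.3, 40.7, 39.2, 45.6],
    [33.8, 29.3, 23.9, 33.3, 37.4, 46.4],
    [31.3, 29.5, 24.4, 30.8, 37.4, 48.5],
]

counts = minus_counts(control, treatments)
flags = declare_significant(counts, critical_value=0)  # 0 is a hypothetical critical value
```

For the two-tailed form of the test, step (2') would replace each count r by min(r, n - r) before the comparison.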

The joint error rate for the test procedure is an experiment-wise or family- 
wise error rate. It is defined as the proportion of experiments in which at least 
one wrong inference is made when $H_0$ is true.

An experiment-wise error rate makes us highly cautious in experiments with 
large numbers of treatments. This suggests that the significance level might be 
chosen according to the number of treatments, being larger as this number in- 
creases. Tables 772a and 772b give a limited number of complete distributions. 
Tables 769 and 770 are for significance levels of .05 and .01, those customarily 
used with per comparison error rates. 

Ties have not been considered here although they will occur in practice. Only 
ties between treatment and check are of concern in this test. If an even number 
is present in any comparison, assign one-half this number to the appropriate r;. 
If an odd number of ties is present, assign one at random, or in a manner de- 
pendent upon the penalty of a wrong decision, and the remainder equally as for 
an even number. 

The usual modifications of the sign test may also be carried out for this
multivariate sign test. In particular, we may test the hypotheses that the
distributions of the k-tuples $(X_{0j} - a_1 X_{1j}, \dots, X_{0j} - a_k X_{kj})$ or of the
k-tuples $(X_{0j} - (A_1 + X_{1j}), \dots, X_{0j} - (A_k + X_{kj}))$ have zero medians,
testing for percentage or additive increases respectively.


3. EXAMPLE 


The accompanying data are a small part of the results of the Cooperative 
Uniform Soybean Tests, 1956, for the North Central States. They consist of 
TABLE 769. VALUES OF MINIMUM r FOR COMPARISON OF k TREATMENTS
AGAINST ONE CONTROL IN n SETS OF OBSERVATIONS:
ONE-TAILED CRITICAL REGION

               k = number of treatment means (excluding control)
  P     n       2    3    4    5    6    7    8    9
 .95    7       0    0    0    0
 .95    8       0    0    0    0    0    0    0    0
 .95    9       1    0    0    0    0    0    0    0
 .99            0    0    0    -
 .95            1    1    1    1    1    1    0
 .99            0    0    0    0    0    0    0    0
 .95   12       2    1    1    1    1    1    1    1
 .99            1    0    0    0    0    0    0    0
 .99            1    1    1    0    0    0    0    0
 .95   14       2    2    2    2    2    2    2    1
 .99            1    1    1    1    1    1    0    0
 .95   15       3    3    2    2    2    2    2    2
 .95   16       3    3    3    3    2    2    2    2
 .99            2    2    1    1    1    1    1    1
 .95   17       4    3    3    3    3    3    3    3
 .99            2    2    2    2    2    1    1    1
 .95   18       4    4    3    3    3    3    3    3
 .99            3    2    2    2    2    2    2    2
 .95   19       4    4    4    4    4    3    3    3
 .99            3    3    2    2    2    2    2    2
 .95   20       5    4    4    4    4    4    4    4
 .99            3    3    3    3    3    2    2    2
TABLE 770. VALUES OF MINIMUM r FOR COMPARISON OF k TREATMENTS
AGAINST ONE CONTROL IN n SETS OF OBSERVATIONS:
TWO-TAILED CRITICAL REGION

               k = number of treatment means (excluding control)
  P     n       2    3    4    5    6    7    8    9
 .95    8       0    0    0    -
 .95    9       0    0    0    0
 .95   10       1    0    0    0    0    0    0    0
 .95   11       1    1    0    0    0    0    0    0
 .95   12       1    1    1    1    0    0    0    0
 .99            0    0    0    0    0    -
 .95   13       2    1    1    1    1    1    1    1
 .99            0    0    0    0    0    0    0    0
 .95   14       2    2    1    1    1    1    1    1
 .99            1    1    0    0    0    0    0    0
 .95   15       2    2    2    2    1    1    1    1
 .99            1    1    1    1    0    0    0    0
 .95   16       3    2    2    2    2    2    2    2
 .99            1    1    1    1    1    1    1    1
 .95   17       3    3    2    2    2    2    2    2
 .99            2    2    1    1    1    1    1    1
 .99            2    2    2    1    1    1    1    1
 .95   19       4    3    3    3    3    3    3    3
 .99            2    2    2    2    2    2    1    1
 .95   20       4    4    3    3    3    3    3    3
 .99            3    2    2    2    2    2    2    2
yields in bushels per acre from two tests in Ontario, three in Ohio, one in
Michigan, two in Wisconsin, two in Minnesota, two in North Dakota, and one in
South Dakota. C is considered to be the standard or control variety. Clearly,
the data were obtained under widely differing conditions.


                     Mean Yield in Bushels per Acre

                                 Location
Strain      A          B          C          D          E          F
  X      29.2(+)    21.4(-)    36.3(+)    40.7(+)    39.2(+)    45.6(-)
  Y      33.8(+)    29.3(+)    23.9(+)    33.3(-)    37.4(+)    46.4(-)
  Z      31.3(+)    29.5(+)    24.4(+)    30.8(-)    37.4(+)    48.5(-)
  C      23.8       25.4       17.3       33.5       34.9       49.4

                                 Location                                       Number of
Strain      G          H          I          J          K          L          M        minuses
  X      …(-)       19.8(-)    24.0(+)    20.5(-)    26.2(-)    34.4(+)    46.1(+)        6
  Y      …0(+)      25.7(-)    20.2(-)    28.4(+)    30.3(+)    32.5(-)    47.1(+)        5
  Z      …9.0(+)    29.1(+)    24.5(+)    28.4(+)    29.8(+)    33.5(+)    44.5(+)        2
  C      …8         27.3       20.8       24.2       28.4       32.8       44.4

(These data are used with approval of the Field Crops Research Branch, ARS,
USDA, and cooperating agencies.)

Reference to Table 769 shows that Z is significantly better than C, the
tabulated value of min r for α = .05 being 2.


4. DISTRIBUTION OF $(r_1, \dots, r_k)$ AND MIN $r_i$


Consider the (k+1)-tuple that constitutes a single observation. Record the
differences $X_{ij} - X_{0j}$, $i = 1, \dots, k$, as 1 if negative and 0 if positive. This
gives a vector of k components, each of which may be a 1 or a 0. The sum of the
vectors gives the value of the test criterion $(r_1, \dots, r_k)$, any component being
the total number of minuses observed for the particular comparison.

For any trial, there are (k+1)! equally likely arrangements, when the null
hypothesis is true, of the (k+1) observations. These give rise to $2^k$ possible
vectors. Thus if k=4, there are 5!=120 possible arrangements but only $2^4 = 16$
possible vectors. The vector (1, 1, 1, 1) appears in 4!=24 possible arrangements
($X_{0j}$ is the largest observation); the vector (1, 1, 0, 0) appears in 2!2!=4
possible arrangements; and so on. The probability with which any vector appears
is the ratio of the product of the number of arrangements of the observations on
each side of the control to the total number of arrangements.

Denote the set of possible vectors in a single trial by $v_1, \dots, v_s$, where
$s = 2^k$, and their probabilities by $p_1, \dots, p_s$. Then the probability of
obtaining the outcome $(r_1, \dots, r_k)$ as the sum of the vectors in n trials is the
sum of the coefficients of one or more terms in the expansion of expression (1).
\[
(p_1 z_1 + \cdots + p_s z_s)^n \tag{1}
\]

To find a particular term, first solve equation (2) for the $n_i$'s, subject to
the restriction $\sum n_i = n$.

\[
\sum_{i=1}^{s} n_i v_i = (r_1, \dots, r_k) \tag{2}
\]

The resulting solutions determine the appropriate terms in the expansion of
expression (1) in that each solution is also a set of exponents of the $z_i$'s and,
hence, gives a term.

Consider the problem of computing the probability associated with a particular
value of $(r_1, \dots, r_k)$. For this, we first solve equation (2). The procedure
for solution will now be illustrated for k=3, n=6 and the right side equal to
(5, 4, 1); generalization of the procedure is obvious. Note that k=3, hence
$s = 2^3 = 8$. Write equation (2) as equation (3) including the restriction
$\sum n_i = n$.

\[
(n_1, \dots, n_s)
\begin{pmatrix}
1 & 1 & 1 & 1 \\
1 & 1 & 0 & 1 \\
1 & 0 & 1 & 1 \\
1 & 0 & 0 & 1 \\
0 & 1 & 1 & 1 \\
0 & 1 & 0 & 1 \\
0 & 0 & 1 & 1 \\
0 & 0 & 0 & 1
\end{pmatrix}
= (5, 4, 1, 6) \tag{3}
\]


TABLE 772a. EXACT PROBABILITY DISTRIBUTIONS FOR MINIMUM $r_i$
k=2, n=4(1)10

Minimum r            Probability of event in column 1 for n =
is equal to      4       5       6       7       8       9       10
     0         .113      …       …       …       …       …       …
     1         .364    .251    .161    .098    .058    .033    .0188
     2         .375    .381    .327    .253    .181    .122    .0793
     3           …       …       …       …       …       …       …
     4         .012    .062      …     .221    .273    .289    .2716
     5                 .004    .027    .076    .141    .204    .2461
     6                           …     .011    .038    .083      …
     7                                 .000    .005    .018    .0466
     8                                           …       …     .0086
     9                                                 .000    .0007
    10                                                         .0000


TABLE 772b. EXACT PROBABILITY DISTRIBUTIONS FOR MINIMUM $r_i$
k=3, n=4(1)7

Minimum r       Probability of event in column 1 for n =
is equal to      4       5       6        7
     0         .154    .082    .0430    .0221
     1           …     .326    .2114    .1339
     2         .336    .389    .3682    .3051
     3         .082    .184    .2775    .3263
     4         .004    .018    .0894    .1692
     5                 .001    .0104    .0398
     6                         .0002    .0035
     7                                  .0001


Single trial vectors serve as the first k elements of the row vectors in the co- 
efficient matrix and are ordered lexicographically. It is the latter fact that 
makes the procedure for obtaining solutions easy. Begin with the first column 
in the coefficient matrix and the restriction. Since $r_1 = 5$ and n=6, we need 5
ones and 1 zero. See step 1 in the accompanying scheme. This part-solution is 
carried into step 2 where the second column of the coefficient matrix is intro- 
duced. The required 4 ones may be obtained in two ways. These two part- 
solutions are now carried into step 3 where each gives three solutions. Notice 
that when any part solution includes a zero, there is no need to carry the cor- 
responding vector into the next step. If, at each step, the solutions are obtained 
in an orderly fashion, there is little chance of missing or repeating one. 

To compute probabilities, we now return to the expansion of expression (1).
Probabilities $p_i$ are computed as described in paragraph 2 of this section. For
k=3, there are (3+1)!=24 equally likely arrangements of the observations.
Where $X_0$ is the least or greatest observation, there are 3!=6 arrangements
giving the same vector; where $X_0$ is not least or greatest, there are 2!=2
arrangements that give the same vector. Thus for equation (3) the first and last
solutions have probabilities

\[
\frac{6!}{3!}\,\frac{6^2\,2^4}{(24)^6}
\quad\text{and}\quad
\frac{6!}{3!\,2!}\,\frac{2^6}{(24)^6}.
\]

The probability that $(r_1, r_2, r_3) = (5, 4, 1)$ is the sum of the six
probabilities so computed. This probability applies to each of 12 vectors, those
with numbers which are the 6 permutations of 5, 4, 1 and those with numbers
which are the 6 permutations of 1, 2, 5, the numbers in the complement of
(5, 4, 1). Because of symmetry, $(r_1, \dots, r_k)$ and its complement
$(n - r_1, \dots, n - r_k)$ have the same probability of occurrence. Note that the
probability associated with the vector (5, 3, 1) applies to only 6 vectors because
the numbers in the complement of (5, 3, 1) are simply a permutation of the same
numbers.

When probabilities have been computed for the minimum number of terms 
necessary to construct the complete probability distribution, the sum of the 
products of the probabilities and the number of terms having the specified 
probability serves as a check on the procedure. (Unfortunately, the number of 
terms to be computed increases rapidly as either k or n increases.) From this 
distribution, the distribution of the minimum r; is obtained. This is the required 
distribution. 
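The enumeration described in this section can be cross-checked by brute force. The sketch below does not follow the step-by-step scheme of the paper; instead it assumes the single-trial probabilities stated above (a vector with m ones has probability m!(k-m)!/(k+1)!) and builds the joint distribution of the counts by adding one trial at a time in exact rational arithmetic. All function names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product
from math import factorial


def single_trial_probs(k):
    """Null probability of each 0/1 vector: m!(k-m)!/(k+1)! for a vector with m ones."""
    total = factorial(k + 1)
    return {
        v: Fraction(factorial(sum(v)) * factorial(k - sum(v)), total)
        for v in product((1, 0), repeat=k)
    }


def joint_distribution(k, n):
    """Exact null distribution of (r_1, ..., r_k) after n independent trials."""
    probs = single_trial_probs(k)
    dist = {(0,) * k: Fraction(1)}
    for _ in range(n):
        nxt = defaultdict(Fraction)
        for r, pr in dist.items():
            for v, pv in probs.items():
                nxt[tuple(a + b for a, b in zip(r, v))] += pr * pv
        dist = dict(nxt)
    return dist


def min_distribution(k, n):
    """Exact null distribution of the minimum r_i."""
    out = defaultdict(Fraction)
    for r, pr in joint_distribution(k, n).items():
        out[min(r)] += pr
    return dict(out)


joint_3_6 = joint_distribution(3, 6)
p_541 = joint_3_6[(5, 4, 1)]       # probability that (r1, r2, r3) = (5, 4, 1)
p_min0 = min_distribution(3, 6)[0]  # probability that min r = 0 for k=3, n=6
```

For k=3 and n=6 the probability of min r = 0 computed this way rounds to .0430, the value quoted for the exact distribution in Section 5.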
SCHEME FOR SOLVING EQUATION 3

[Tabular scheme in three steps: the step 1 part-solution; the two step 2
part-solutions obtained from step 1; the six step 3 solutions obtained from
step 2.]

5. AN APPROXIMATION 


An obvious conjecture is that $(r_1, \dots, r_k)$ is from a multivariate normal
distribution and that $(\min r - \mu_r)/\sigma_r$ is distributed approximately as
Dunnett's [2] t for infinite degrees of freedom. For this approximation,
$\mu_r = n/2$, $\sigma_r^2 = n/4$. However, Dunnett's t is computed on the basis that
$\rho = 0.5$, whereas for the distribution of $(r_1, \dots, r_k)$, the correlation
between $r_i$ and $r_j$ is $\rho = 1/3$. Roessler [3] has computed tables with
$\rho = 0$, which are comparable to Dunnett's tables for two-sided comparisons and
joint confidence coefficients of P=.95 and .99. A comparison shows that
corresponding tabulated values differ only in the second decimal place for P=.95
and never by more than .1 for P=.99. (Dunnett's table gives two decimal places,
Roessler's gives one place.) Since the appropriate $\rho$ lies between those used by
Roessler and Dunnett, since the Roessler and Dunnett tables differ so little, and
since Dunnett gives two decimal places, it was decided to use the latter in
computing tables.

Tables 769 and 770 were computed by taking the integral part of the number 
computed by means of equation 4 with ¢ from Dunnett’s tables. Where the 
computation gave negative values, it was assumed that no value of r; should be 
declared significant. The equation was suggested by approximations given by 
Dixon and Mood [1], the final form being chosen as a result of comparing 
computed values with the exact values obtainable from Tables 772a and 772b. 


\[
r = \frac{n-1}{2} - (\text{Dunnett's } t)\,\frac{\sqrt{n}}{2} \tag{4}
\]
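As a rough illustration of how a table entry would follow from equation 4, the sketch below reads the equation as r = (n-1)/2 - t·√n/2, takes the integral part, and returns no critical value when the result is negative. The function name is hypothetical, and the value t = 2.79 used in the example is an assumption (taken here as Dunnett's two-sided entry for two treatments, P=.99, infinite degrees of freedom); it is not quoted in this paper.

```python
import math


def approx_critical_r(n, dunnett_t):
    """Return (raw value of equation 4, tabulated critical r or None if negative)."""
    value = (n - 1) / 2 - dunnett_t * math.sqrt(n) / 2
    critical = math.floor(value) if value >= 0 else None
    return value, critical


# k=2, n=9, P=.01 (two tails); t = 2.79 is an assumed table entry.
raw, critical = approx_critical_r(9, 2.79)
```

With these inputs the raw value is about -.185, so no value of r would be declared significant under the approximation.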


Of the 32 comparable values, 4 differed, the approximation giving no value as
significant whereas r=0 was significant for the first time with increasing n.
These discrepancies occurred for k=2, n=8, P=.01 (one tail), for k=3, n=6,
P=.05 (one tail), for k=2, n=9, P=.01 (two tails), and for k=3, n=7, P=.05
(two tails), where equation 4 gave -.008, -.021, -.185 and -.136 respectively.
The corresponding true probabilities for r=0 were .008, .0430, <.008
and <.0442.
THE LADY TASTING TEA, AND ALLIED TOPICS*†


N. T. GRIDGEMAN
National Research Council, Ottawa


The well-known discussion of the principles of experimentation, il- 
lustrated by a taste-testing problem, in R. A. Fisher’s Design of Ex- 
periments, is the basis of this expository paper. The notion of a hypo- 
thetical population of identical experiments is defended. It is argued 
that attention must be paid to non-null cases (in testing theory) if a
satisfactory probabilistic model for sensory sorting tests is to be built,
and if the efficiency of various experimental designs is to be considered.
Finally, some remarks are made on the role of randomization, and on 
the problem of “inexact” acceptance regions in discrete distributions. 


1. INTRODUCTION


In her public but anonymous life of a quarter of a century, Fisher’s tea
connoisseuse [1] has provoked widespread attention. The original discus-
sion concerns the checking of the lady’s claim to be able to tell, either infallibly 
or merely more often than not, whether a cup of tea has been made by the 
addition of the milk to the infusion or vice versa. Prominence, but not exclu- 
siveness, is given to what may be called a double-tetrad sorting design, scil., 
the presentation of four cups made one way and four the other, the whole 
eight being set out in a coded random arrangement, with the request that 
they be sorted, by taste alone, into their two proper subgroups. The issues 
raised in Fisher’s beautifully written chapter are still alive and still productive 
of food for thought. The following reappraisal will be conducted in the light, 
particularly, of what two searching commentators, Neyman [4] and Wrighton 
[7], have had to say about the problem. 


2. NEYMAN’S VIEWS 


Neyman first considers a design unmentioned by Fisher, namely, pair com- 
parison. The versatility of this design, which simply calls for a decision as to 
which of two coded items has a specified attribute, makes it popular in the field 
of sensory perception. Neyman then turns to a cup-by-cup design (each cup to 
be judged independently), and finally to the double-tetrad design, and he 
points out that separate faculties are involved here, the former being a test 
of the lady’s accuracy of identification, and the latter being a test of her
discriminability. He goes on to discuss the hardly avoidable supposition of an
empirical probability p > 1/2 associated with the lady’s judgments, but he notes
difficulties in the way of building a plausible statistical model to contain this 
parameter. His whole treatment is laced with the idea of the power function, 
and he chides Fisher for showing an awareness, at one point, of the “so called” 
error of the second kind, and yet repudiating it. 


* Expanded from a paper presented to the Fourth International Biometric Conference, Ottawa, 1958.
† Contribution from the Division of Applied Biology, National Research Council, Ottawa, Canada. Issued as
N.R.C. No. 5434.


3. WRIGHTON’S VIEWS


In a long paper on statistical models for therapeutic trials, Wrighton [7]
addresses the tea-tasting lady in an atmosphere of stern pragmatism. He
contemns the “undefined concept of a degree of sensory perception” introduced 
by Fisher (although, in fact, perception is here a misquote of discrimination) 
and adopted by Neyman. Two alternative interpretations are outlined, and 
both are found wanting. The first is that “the lady’s sensory equipment operates 
in a fashion characteristic of the roulette board.” And the second, grounded in 
Fisher’s own exposition, is the identification of “the lady as she is at the time 
of the experiment with a conceptual lady out of whose infinitely protracted 
tea-tasting the experience of the experiment is regarded as a random sample.” 
But our interest, Wrighton protests, lies in the behavior of the flesh-and-blood 
lady, not in her Platonic ideal. His own solution is rather drastic: brushing 
aside the belief that the lady’s limited claim (that she can usually, but not 
invariably, tell the difference) is even amenable to scientific check, he concludes 
that the only worthwhile experiment would be to bring to trial a bevy of ladies 
who claim the faculty. Statements about the bevy would then be possible— 
and Wrighton avers that the making of statements, rather than decisions, is 
the sole legitimate task of the scientist engaged in this kind of work. 

It may be noted that Lancelot Hogben, in his stimulating, iconoclastic and 
tortuously argued book, Statistical Theory [3], supports Wrighton and declares
that “Neyman’s p, lying somewhere between 0.5 and 1.0, is a tiresome distrac- 
tion which contributes nothing to the business in hand.” 


4. THE HYPOTHETICAL POPULATION 


To rescue the lady from the imbroglio let us begin with Fisher’s notion of a 
hypothetical population of experiments. This puts no strain on my imagination, 
nor do I think, pace Wrighton, that it shoulders the real lady out of the picture 
in favor of a conceptual lady to whom alone credit or blame is due. To
appreciate Fisher’s ideation, we must remind ourselves of his consistent attitude,
dating back to his work on another kind of t test, towards exiguous observations
and statistical hypotheses. In the present connection he may be paraphrased as 
follows: “Our field is scientific research. Now statistics also embraces sampling 
problems, industrial quality control, and various other subjects in which popu- 
lations of real items or events are being studied. Much ingenious theory has 
been developed to cope with these things, but it is naive and perhaps danger- 
ous to assume that that theory can be shifted en bloc into the laboratory for the 
indiscriminate use of the man who has a handful of experimental results to in- 
terpret. This man is better served by the concept of a hypothetical popula- 
tion of identical experiments, and our theory of testing allows him to make rea- 
sonable judgments as to what to believe or what to do next. There is no ques- 
tion of his spending a professional lifetime making decisions of which a pre- 
determined proportion will lead to wrong action. Each experiment is unique, 
and each conclusion is tentative and revocable.” 

This is the sense then in which we can talk about a hypothetical population 
of identical tea-tasting experiments. It introduces a disarming simplicity. While 
Neyman worries about a statistical model to accommodate p ≠ 1/2, and while
Wrighton inveighs against the use of nonholonomic experimentation, Fisher 
quietly says to the lady, “I shall uphold your claim if, and only if, you sort 
these 8 cups correctly.” As an intellectual exercise, or as a pathway to more
sophisticated researches, we may discuss a degree of sensory perception (as 
Fisher himself does) and its relation to an empirical probability (as Fisher 
does not), but these issues are not immediately germane to the job of testing 
the claim. Moreover, from this point of view it becomes a matter of indiffer- 
ence which of the following two hypotheses holds: 


(i) That all ladies are binomially classifiable according to whether or not 
they can detect the particular tea difference. 

(ii) That all ladies may be ranked according to their degree of sensory per- 
ception. 


5. THE EMPIRICAL PROBABILITY 


We must however go further. Scientists are rarely given ladies and cups of 
tea to experiment with, but structurally similar problems frequently come their 
way. Sometimes, as for example in psychophysics, hypothesis testing is less 
important than parameter estimation. Now a statistical model for general use
in this area cannot be constructed without the introduction of an empirical 
probability. Consider the observation, common to many investigations of 
sensory perception, that in replicated trials, by a single subject, of the presence 
of a marginally detectable stimulus, the frequency of correct responses may be 
less than 100% but significantly higher than random expectation (on the null 
hypothesis). It is then natural to postulate a neural mechanism that may or 
may not respond to the stimulus, and we can reasonably associate the event 
(the response) with a parameter p, (the subscript being added to avoid confu- 
sion with Neyman’s p). 

This idea can be variously accommodated to the tea-tasting problem. One 
way is to invoke separate parameters for the sensory recognition of each kind 
of cup. But the simplest and most manageable scheme is to begin with the 
definition of a single parameter, p,, as the probability of recognition of, say, 
a milk-first cup. Then the probability of the correct allocation of such a cup 
will be a function of choice and chance. Now as this argument and the function 
will apply equally to the allocation of a milk-last cup, we can operationally
re-define p, as the probability of the recognition of any cup in its group context. 


6. A MODEL 


Suppose the lady is confronted with 2N cups of tea, N prepared one way 
and N the other, and that there is a probability p, of her sensorily identifying 
any one cup. The number of cups of either kind so identified, say X(0<X <N), 
will be a binomial variate with parameter p,. Then on the assumption that the 
remaining 2N—X cups are allocated at random to the subgroups (making 
each up to N) it can be shown that the probability of exactly R “successes” 
(the number of correctly allocated items in either group) is 


x-0 \X/\N —R R N 


for 0 ≤ X ≤ R ≤ N.

The derivation and implications of this function, and the lower moments of the
distribution, have been given elsewhere [2]. For the special case of N =4 (as 
in the lady’s tea problem) the relation between p, and P(R), E(R), and V(R), 
is shown in Figure 779. 

When N =1 we have the limiting case of pair comparison, for which expres- 
sion (1) reduces to 


P(R = 1) = (1 + p,)/2    (= Neyman’s p)    (2)


The model can be generalized further to accommodate asymmetric groupings, 
i.e., the sorting off of N items from a group of M#2N, but, as Fisher himself 
remarks, these imbalanced designs have no apparent advantage. An important 
exception is the limiting case of N =1; M>2. This is the test task of picking out 
an oddity, and its simplest condition is M=3, which gives the well-known 
triangular test. 


7. SENSITIVITY, EFFICIENCY, AND POWER 


“By increasing the size of the experiment”, writes Fisher [1], “we can render
it more sensitive, meaning by this that it will allow the detection of a lower
degree of sensory discrimination, or, in other words, of a quantitatively smaller
departure from the null hypothesis . . . ; we may say that the value of the
experiment is increased whenever it permits the null hypothesis to be more readily
disproved.” Exemplifying, he points out that the double-hexad design is more
“sensitive” than the double-tetrad insofar as only the former can include in the
5% rejection tail a result with just one mistake, because (1+36)/924 < 1/20,
whereas (1+16)/70 > 1/20.

FIG. 779. Characteristics of the double-tetrad sorting distribution (see expression (1)).
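The two fractions in Fisher's comparison can be reproduced by direct counting. In the sketch below, which is an illustration rather than anything taken from Fisher's text, a "mistake" is taken to mean one cup of each kind placed in the wrong subgroup, so that m mistakes can arise in C(N, m)² ways out of C(2N, N) equally likely sortings under the null hypothesis. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def null_tail(n_each, max_mistakes):
    """Null probability of at most max_mistakes mistakes when sorting 2*n_each cups."""
    ways = sum(comb(n_each, m) ** 2 for m in range(max_mistakes + 1))
    return Fraction(ways, comb(2 * n_each, n_each))


hexad_one_mistake = null_tail(6, 1)   # (1 + 36)/924
tetrad_one_mistake = null_tail(4, 1)  # (1 + 16)/70
tetrad_no_mistake = null_tail(4, 0)   # 1/70
```

Only the all-correct result falls in the 5% tail of the double-tetrad design, whereas the double-hexad design also admits a single mistake.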

Now all this is fine, but far from exhaustive. Some consideration of efficiency 
is surely needed. The efficiency of an experiment is increased when the null 
hypothesis can be more readily disproved for given non-null cases and without 
increase of the number of test items (i.e., without a rise in cost). If we admit the 
concept of a degree of sensory perception, and if we biometrize this as a param- 
eter of perception p,, efficiency can be virtually identified with Neyman-Pear- 
son power, and although, as we shall see presently, meaningful numerical values 
are rarely obtainable, we shall at least be on a realistic approach. 

Consider these three experiments: 


(i) The lady is to carry out the double-tetrad sorting test in duplicate. 
(ii) She is asked to do a similar sorting with half the number of cups (i.e., a 
double-pair design) in quadruplicate. 
(iii) She is asked to make 8 replicate pair comparisons. 


Whichever the experiment, the lady will have the same amount of tasting to 
do, and she can make a maximum of eight mistakes, and we can accept her 
claim (at the conventional 5% probability level) if in fact she makes zero or one 
mistake. Therefore, by a slight extension of Fisher’s ideas, all three experiments 
can be called equally sensitive. But if we compute the probabilities of avoid- 
ance of a Type II error for various p,, it turns out that the double-tetrad design 
is consistently the least, and pair comparison consistently the most, powerful. 
This is on the assumption that p, is independent of design, and maybe we are 
wrong about that, but if so it is to be expected that p, will be larger for the 
simpler designs, and this would make the true power differences even bigger 
(in the same ordering). 
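The claim that all three experiments admit "zero or one mistake" at the 5% level can be checked under the null hypothesis alone, without any model for the non-null case. The sketch below makes the same counting assumption as before (m mistakes in a double-N sorting arise in C(N, m)² of the C(2N, N) equally likely sortings, and a pair comparison is the case N = 1) and convolves the replicate distributions; the names are illustrative.

```python
from fractions import Fraction
from math import comb


def mistake_distribution(n_each):
    """Null distribution of the number of mistakes in one double-n_each sorting."""
    total = comb(2 * n_each, n_each)
    return [Fraction(comb(n_each, m) ** 2, total) for m in range(n_each + 1)]


def replicate(dist, times):
    """Distribution of the total number of mistakes over independent replicates."""
    out = [Fraction(1)]
    for _ in range(times):
        nxt = [Fraction(0)] * (len(out) + len(dist) - 1)
        for i, a in enumerate(out):
            for j, b in enumerate(dist):
                nxt[i + j] += a * b
        out = nxt
    return out


designs = {
    "double-tetrad, duplicate": replicate(mistake_distribution(4), 2),
    "double-pair, quadruplicate": replicate(mistake_distribution(2), 4),
    "pair comparison, 8 replicates": replicate(mistake_distribution(1), 8),
}
at_most_one = {name: d[0] + d[1] for name, d in designs.items()}
at_most_two = {name: d[0] + d[1] + d[2] for name, d in designs.items()}
```

Under these assumptions each design has a null probability below 1/20 for at most one mistake and above 1/20 for at most two, which is the sense in which the three experiments are equally sensitive. Their power under a non-null model is a separate question that this sketch does not address.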

These findings will however not necessarily apply to scaled-up experiments
or to other levels of significance. So, bearing in mind that p, is a priori unknown,
we have to conclude that no general statement can be made about powers. The
relative efficiency of the various designs cannot therefore be deduced
theoretically. In sum, we may say that, operationally speaking, sensitivity is
specifiable but unimportant, whereas efficiency is unspecifiable but important.

Incidentally, laboratory tasters prefer simple test units (such as pair
comparison), which probably stimulate discriminability. The double-tetrad design
is uncommon in practice. Fisher of course uses it to illustrate principles and
does not urge its pre-eminence.


8. RANDOMIZATION 


Fisher says that randomization is “the physical basis of the validity of the
test.” But his thread of argument is not overly easy to trace in this context.
Elsewhere, discussing randomization in the analysis of variance, he is abundantly
clear, yet in the tea-tasting chapter he seems indifferent to the danger of
bias—psychosensory, not statistical—when inter-cup differences are present 
and merely randomized. At one juncture he asks us to imagine that some of the 
cups are made with Indian, and some with China, tea; and although he dep- 
recates this because it may inhibit the lady’s discriminatory faculty, he is 
emphatic that the extraneous difference does not affect the validity of the test 
—provided that the cups are allocated at random to the subgroups. This 
seems to be true only if the hypothetical population is also conceived as a 
random sample from a superpopulation. 

Suppose that 3 cups are of Indian tea and the other 5 are of China, and that 
the randomization allocates all the Indias to the milk-first group. This suggests
as the null hypothesis, “The lady cannot distinguish the two methods of
tea pouring when three-quarters of the milk-first cups are of Indian tea, and 
when one-quarter of the milk-first cups and all the milk-last cups are of China 
tea.” The limitation is glaring. The lady may subconsciously associate Indian 
tea with the milk-first method, in which event the theoretical probabilities on 
which the test is based will be spurious. 

The introduction of a real population of experiments will brighten up the 
landscape here. But the brightness is only apparent. If the lady intended to 
visit many other laboratories to have her claim checked by experimentalists 
each of whom agreed to use 3 cups of Indian tea and 5 of China and each of 
whom randomized unconditionally, I should be happy to accept my own ran- 
dom allocations, no matter how unfair they looked. However, all this would 
be cold comfort because a salient merit of Fisher’s approach is its independence 
of real populations. A more satisfactory solution is to randomize with the re- 
striction that known extraneous differences are orthogonally balanced. With a 
3:5 or any uneven split, such balancing is of course impossible. 


9. DECISION RULES AND ROULETTES 


Remarking on the Type I error in sorting designs, Neyman says that, 
“Operationally, this means that if, in the course of humanity’s research work, 
assertions of new discoveries are made consistently on the ground of experi- 
ments arranged like the one discussed, then, out of all those cases where the 
phenomenon does not exist, the frequency of false discoveries will be less than 
5%.” Despite the well-rounded phrases, and despite the utilitarian note struck 
by the initial adverb, this sentence is at best controversial and at worst un- 
realistic. However, let us accept it at face value for the moment. Notice the 
less than 5%; Neyman is here providing for the fact that discrete distributions 
seldom oblige with a step at exactly the 5% point. He later describes the
consequences as sometimes “disconcerting,” and he wonders whether it would not
be better on occasion to switch from a conventional $\alpha$ such as 0.05 to one nearer
the actual demarcation.

But if Neyman’s thesis is accepted, its application can easily be tidied up.
With the double-tetrad design the probability of an all-correct result (when $H_0$
is true) is 1/70, and that of a single mistake or all correct is 17/70. As
conventionalists we are interested in the 5% point, and this is 3½/70, which is out of
reach, as it were. So let us construct a roulette wheel of 32 equal sectors, 27
black and 5 red. The following rule can now be laid down: If the lady makes no
mistakes, accept her claim; if she makes two or more, reject it; but if she makes
just one mistake, spin the wheel and accept on red and reject on black (because
[1+16(5/32)]/70 = 1/20). Provided that we contract to use the wheel, mutatis
mutandis, for all future experiments of a similar kind, the expectation of a Type
I error will then be precisely 5%. The principle here is not new; it was
expounded by Tocher [6] in 1950, although he used a table of random numbers
instead of a mechanical apparatus.
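
The arithmetic behind the wheel can be laid out explicitly. This is a small sketch with exact fractions; the variable names are ours, and the only inputs are the counts given in the text.

```python
from fractions import Fraction
from math import comb

total = comb(8, 4)                          # 70 equally likely sortings under the null
p_none = Fraction(comb(4, 0) ** 2, total)   # all correct: 1/70
p_one = Fraction(comb(4, 1) ** 2, total)    # exactly one mistake: 16/70
p_red = Fraction(5, 32)                     # chance that the wheel shows red

# Size of the randomized rule: accept outright on no mistakes, accept on red after one mistake.
size = p_none + p_one * p_red
# Conversely, the red fraction needed to hit 5% exactly.
needed_red = (Fraction(1, 20) - p_none) / p_one
print(size, needed_red)
```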

In judging the case for the roulette wheel we might begin with the reflection 
that any argument against the employment of a gambling device in the inter- 
pretative stage of a statistical experiment can almost certainly be applied to the 
design stage. It so happens that the design stage was the first to come under 
the influence of randomization theory, and we commonly use gambling devices 
at this stage, and qualms are rare. Why then should we have qualms about 
the roulette wheel? 

Maybe at this point we should go back to the distinction, implicit in much 
of Fisher’s writing, between two classes of statistical testing and testers. The 
research worker, engaged in making provisional decisions about scientific hy- 
potheses, is more likely to be interested in the actual probabilities of error than 
in critical significance levels; the notion of automatic decision making is foreign 
to his outlook; and he will have no part of a roulette. On the other hand, the 
market-survey worker or the quality control man is often faced with irrevocable 
decisions, and it may at least be argued that the roulette wheel, or some equiva- 
lent mechanism, should be among his tools. Yet who would dare use it? Hardly 
anyone, and the moral perhaps is that we must be wary of over-extending sta- 
tistical theory when dealing with real situations. 

Nevertheless, one application, heavily disguised, of the roulette idea has been
urged in the field of taste technology. Seeking to improve binary-comparison
“triangular” tests by closing in on the point $\alpha$=0.05 or 0.01, Roberts et al. [5]
neither spin a wheel nor consult a table of random numbers; they do what is
crudely similar: they inspect the distribution of like decisions within a group
of judges. As the judges are implicitly assumed to be of equal sensory acuity,
and as, under the null hypothesis, the inter-judge distribution of results is
fortuitous and irrelevant, the method advocated plainly constitutes a built-in
gambling device. But the exposition is couched in wholly alien terms of “the
use of the multinomial rather than the binomial distribution,” without even
an oblique reference to the nub of the matter. Indeed, the authors seem
unaware of the delicate ground they tread. Which just goes to show, as we said at
the outset, that the issues raised by The Lady Tasting Tea will still repay study.


TABLES FOR THE SIGN TEST WHEN OBSERVATIONS ARE
ESTIMATES OF BINOMIAL PARAMETERS


Arthur Cohen


A necessary condition for the sign test is the following: under $H_0$,
$\Pr[x > y] + \tfrac12 \Pr[x = y] = \tfrac12$, where $x$ and $y$ are observations before and
after treatment respectively. However, if the observations before and
after on the individual are maximum likelihood estimates of the same
binomial parameter, this condition will not always hold. A table measuring
the amount by which these probabilities differ from $\tfrac12$ is given for
values 0 (.1) 1 of the true parameter being estimated, and sample sizes,
on which the estimates are based, of 1 through 8. Since for fixed sample
sizes the amount by which these probabilities differ from $\tfrac12$ varies with
the parameter being estimated, another table gives the maximum absolute
deviation from $\tfrac12$, as the parameter varies from 0 (.005) 1. A
theorem by W. Hoeffding leads to the suggestion of a conservative test
of $H_0$ which entails use of the tables. Another simple conservative test
which also makes use of the tables is suggested and an indicator of the
amount of approximation involved in this test is given. A numerical
example is offered.


1. INTRODUCTION 


The sign test, because of its simplicity and validity under very general
circumstances, is a handy and useful tool for judging the significance of the
differences between two treatments [1]. Suppose on each of $k$ classifications
an observation is taken on each of two treatments. Call the observations on the
$i$th classification under treatments one and two $x_i$ and $y_i$ respectively. The null
hypothesis $H_0$ asserts that the median of $x_i - y_i$ equals zero for every $i$. No
assumption regarding relations between the distributions of $x_1, x_2, \ldots, x_k$,
nor regarding relations between those of $y_1, y_2, \ldots, y_k$, is made except that
all variables $x_1, x_2, \ldots, y_k$ are independent. The alternative hypothesis is that
observations under the second treatment are consistently lower than those
under the first treatment for most or all of the classifications. We carry out the
sign test by counting the number $S$ of differences $x_i - y_i$ which have positive
signs. If we assume there are no cases such that $x_i = y_i$ then under $H_0$, $S$ is
binomially distributed with $p = \tfrac12$, i.e. for each $i$,

$$\Pr[x_i > y_i] = \Pr[x_i < y_i] = \tfrac12.$$

If there are cases where $x_i = y_i$ we agree to divide them with equal probability
and again under $H_0$, $S$ is binomially distributed with $p = \tfrac12$, i.e. for each $i$,

$$\Pr[x_i > y_i] + \tfrac12 \Pr[x_i = y_i] = \Pr[x_i < y_i] + \tfrac12 \Pr[x_i = y_i] = \tfrac12.$$

Under the alternative, $S$ would tend to have large values and therefore we
would reject for large $S$. The other one-sided alternative or the two-sided
alternative can be treated similarly.

Suppose now that in the $i$th classification, $x_i$ and $y_i$ are proportions of
successes in $m_i$ and $n_i$ Bernoulli trials respectively. Hence, $x_i$ and $y_i$ are maximum
likelihood estimates of binomial parameters $p_{x_i}$ and $p_{y_i}$. As such, if $m_i$ and $n_i$ are
unequal, and even if $p_{x_i} = p_{y_i}$ (except if $p_{x_i} = p_{y_i} = \tfrac12$, 0, or 1) then

$$\Pr[x_i > y_i] + \tfrac12 \Pr[x_i = y_i] \neq \tfrac12.$$

For example, say $p_{x_i} = p_{y_i} = .3$ and $m_i = 3$, $n_i = 1$. The tabulation of the joint
probability distribution of $x_i$ and $y_i$ permits us to find $\Pr[x_i > y_i] + \tfrac12 \Pr[x_i = y_i]
= .584$ by merely summing the entries in the appropriate cells in Table 785.
This clearly depicts a violation of the sign test model, namely that under $H_0$,
$S$ will not be binomially distributed with $p = \tfrac12$.

TABLE 785. JOINT PROBABILITY DISTRIBUTION OF $x_i$ AND $y_i$

                            $y_i$
                                           Marginal
     $x_i$          0          1         Probabilities

      0          .2401      .1029          .3430
     1/3         .3087      .1323          .4410
     2/3         .1323      .0567          .1890
      1          .0189      .0081          .0270

   Marginal
 Probabilities   .7000      .3000         1.0000


Since the sign test has decided advantages we would like to be able to use it
even in the situation described in the above paragraph. It is for this reason, and
because the sign test may have previously been used indiscriminately in the
above situation, that we propose tables which measure the departure from the
crucial condition that, under $H_0$, $\Pr[x_i > y_i] + \tfrac12 \Pr[x_i = y_i] = \tfrac12$.

To illustrate, suppose we wish to determine whether a new antibiotic
treatment reduces the amount of a type of staphylococcal infection. On the $i$th of $k$
individuals we take $m_i$ cultures before the treatment is given and $n_i$ cultures
after the treatment is given. We assume $m_i$ and $n_i$ are unequal for at least some
of the $k$ individuals. (In fact they often are unequal since the culture could
become contaminated, the individuals do not report every time they are requested
to, or the culture may not have been taken properly.) Hence $x_i$ would denote
the observed proportion of this type of staphylococcal infection for the $m_i$
cultures drawn from the $i$th individual before treatment, and $y_i$ would denote the
observed proportion of infections for the $n_i$ cultures drawn from the $i$th
individual after the treatment.


2. DEFINITIONS AND PROPOSAL OF TABLES 


Let $x_i$ be the proportion of successes in $m_i$ Bernoulli trials with probability
$p_i$, and $y_i$ be the proportion of successes in $n_i$ Bernoulli trials also with
probability $p_i$. This is the condition under the null hypothesis, which we shall assume
to be true in the following development.

Now the use of the sign test under conditions described in the preceding
section was impeded because under $H_0$, $\Pr[x_i > y_i] + \tfrac12 \Pr[x_i = y_i] \neq \tfrac12$. If
however, we found that this value was very nearly $\tfrac12$ we might still be able to use
the sign test or make some modification to enable its use. For this purpose we
define

$$G(m_i, n_i, p_i) = \Pr[x_i > y_i] + \tfrac12 \Pr[x_i = y_i] - \tfrac12, \quad \text{for all } i.$$

We also define

$$D(m_i, n_i) = \sup_{p_i} \, | G(m_i, n_i, p_i) |.$$

Now for given $m_i$ and $n_i$, if $D(m_i, n_i)$ is small and the number of the $k$
classifications for which $\Pr[x_i > y_i] + \tfrac12 \Pr[x_i = y_i] \neq \tfrac12$ is few, then on some occasions
we would not hesitate to proceed with the usual sign test.
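
The two quantities just defined are easy to evaluate by enumerating the joint distribution of the two binomial counts. The following is a minimal sketch, not the program used to prepare the tables: the function names are ours, proportions are compared by cross-multiplication to avoid rounding, and the supremum in D is approximated on the grid 0 (.005) .5 described later in the paper.

```python
from math import comb

def binom_pmf(j, n, p):
    """Probability of j successes in n Bernoulli trials with success probability p."""
    return comb(n, j) * p**j * (1 - p)**(n - j)

def G(m, n, p):
    """Pr[x > y] + (1/2) Pr[x = y] - 1/2 for x = a/m and y = b/n, both with parameter p."""
    total = 0.0
    for a in range(m + 1):
        for b in range(n + 1):
            w = binom_pmf(a, m, p) * binom_pmf(b, n, p)
            if a * n > b * m:        # a/m > b/n
                total += w
            elif a * n == b * m:     # tie, divided with equal probability
                total += 0.5 * w
    return total - 0.5

def D(m, n, step=0.005):
    """Largest |G(m, n, p)| over the grid p = 0, step, ..., 1/2."""
    grid = [i * step for i in range(int(round(0.5 / step)) + 1)]
    return max(abs(G(m, n, p)) for p in grid)

print(round(G(3, 1, 0.3) + 0.5, 3))   # the introductory example with m = 3, n = 1, p = .3
print(round(D(2, 1), 5))
```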

Whether the $D(m_i, n_i)$ are small or not for given $m_i$ and $n_i$, and whether the
number of the $k$ classifications for which $\Pr[x_i > y_i] + \tfrac12 \Pr[x_i = y_i] \neq \tfrac12$ is few
or not, we will always be able to make a conservative test of $H_0$ against the
one-sided alternative. (Conservative in the sense that, if $\alpha$ is the prescribed level of
significance for the conservative test of $H_0$ against the one-sided alternative,
then the probability of an error of the first kind will not exceed $\alpha$.) One such
conservative test, and probably the most applicable one for this situation,
depends on a theorem of Hoeffding [3].

Theorem [3, p. 718]. Let $S$ be the number of successes (plus signs) in $k$
independent trials and let $q_i$ denote the probability of success in the $i$th trial,
$i = 1, 2, \ldots, k$ (Poisson trials). If $ES = kq$, and $b$ and $c$ are two integers such
that

$$0 \le b \le kq \le c \le k$$

then

$$\sum_{r=b}^{c} \binom{k}{r} q^r (1 - q)^{k-r} \le \Pr(b \le S \le c) \le 1.$$

Both bounds are attained. The lower bound is attained only if
$q_1 = q_2 = \cdots = q_k = q$ unless $b = 0$ and $c = k$.

“The lower bound for $\Pr(b \le S \le c)$ shows that the usual one-sided test for the
constant probability of “success” in $k$ independent (Bernoulli) trials can be
used as a test for the average probability $q$ of success when the probability of
success varies from trial to trial” [3, p. 720]. More explicitly, if our test says
to reject $H_0$ if $S > c_\alpha$ ($c_\alpha$ an integer larger than $kq$) then the theorem says,

$$\alpha = \sum_{r=c_\alpha+1}^{k} \binom{k}{r} q^r (1 - q)^{k-r} = 1 - \sum_{r=0}^{c_\alpha} \binom{k}{r} q^r (1 - q)^{k-r} \ge 1 - \Pr[0 \le S \le c_\alpha \mid H_0] = \Pr[S > c_\alpha \mid H_0],$$

indicating the probability of rejecting $H_0$ when it is true, will not exceed $\alpha$.
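
The inequality can be illustrated numerically by comparing the exact upper tail of a sum of unequal-probability trials with the binomial tail at the average probability. This is a sketch only: the success probabilities below are hypothetical, chosen by us for illustration, and the helper names are not from the paper.

```python
from math import comb

def poisson_binomial_pmf(probs):
    """Exact distribution of the number of successes in independent trials with unequal probabilities."""
    pmf = [1.0]
    for q in probs:
        new = [0.0] * (len(pmf) + 1)
        for s, w in enumerate(pmf):
            new[s] += w * (1.0 - q)      # failure on this trial
            new[s + 1] += w * q          # success on this trial
        pmf = new
    return pmf

def binomial_upper_tail(k, q, c):
    """Pr[S > c] when S is binomial with parameters k and q."""
    return sum(comb(k, r) * q**r * (1.0 - q)**(k - r) for r in range(c + 1, k + 1))

probs = [0.50, 0.58, 0.52, 0.60, 0.55, 0.50, 0.63, 0.54]   # hypothetical q_i
k = len(probs)
q_bar = sum(probs) / k
pmf = poisson_binomial_pmf(probs)

checks = []
for c in range(k):
    if c >= k * q_bar:                   # the theorem requires kq <= c
        exact = sum(pmf[c + 1:])
        bound = binomial_upper_tail(k, q_bar, c)
        checks.append(exact <= bound + 1e-12)
        print(c, round(exact, 6), round(bound, 6))
```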
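To see how application of this theorem leads us to a conservative test, let

$$z_i = \begin{cases} 1 & \text{if } x_i - y_i > 0, \\ 0 & \text{if } x_i - y_i < 0, \\ 1 \text{ with probability } \tfrac12 \text{ (and 0 otherwise)} & \text{if } x_i - y_i = 0. \end{cases}$$

Hence $z_i$ is a binomial variable with mean

$$E z_i = \Pr[z_i = 1] = \tfrac12 + G(m_i, n_i, p_i) \le \tfrac12 + D(m_i, n_i).$$

If we let $S = \sum_{i=1}^{k} z_i$, then $S$ has mean

$$ES = \frac{k}{2} + \sum_{i=1}^{k} G(m_i, n_i, p_i) \le \frac{k}{2} + \sum_{i=1}^{k} D(m_i, n_i).$$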

If the $G(m_i, n_i, p_i)$ are not equal for all $i$, then the $z_i$ represent a sequence of
Poisson trials with probability of success $\tfrac12 + G(m_i, n_i, p_i) \le \tfrac12 + D(m_i, n_i)$.
Suppose the alternative hypothesis is that probabilities for individuals are smaller
under the second condition than under the first condition, and we consider
these $z_i$, having probability of success equal to $\tfrac12 + G(m_i, n_i, p_i)$ and hence an
average probability of success

$$\frac12 + \frac{1}{k} \sum_{i=1}^{k} G(m_i, n_i, p_i),$$

which we denote by $q$. Then $S$ has mean

$$kq = \frac{k}{2} + \sum_{i=1}^{k} G(m_i, n_i, p_i).$$

If we specify two integers $b$ and $c$ such that $0 \le b \le kq \le c \le k$ then all conditions
of the theorem are satisfied and we can say that testing the hypothesis that $S$ is
binomially distributed with parameters $k$, $q$ is a conservative test of $H_0$.

However we are unable to carry out this test unless we can specify $q$. Notice
$q$ depends on the $G(m_i, n_i, p_i)$ which, for fixed $m_i$ and $n_i$, are unknown unless the
$p_i$ are known. If we substitute $D(m_i, n_i)$ for $G(m_i, n_i, p_i)$ and test the hypothesis
that $S$ is binomially distributed with parameters $k$, and

$$q' = \frac12 + \frac{1}{k} \sum_{i=1}^{k} D(m_i, n_i),$$

then once again we have a conservative test for $H_0$, one which is more
conservative than the test of $S$ being binomial with parameter $q$. (We assume that the
parameter for $S$ under the alternative hypothesis is substantially greater than
$\tfrac12 + \tfrac{1}{k} \sum_{i=1}^{k} D(m_i, n_i)$.)
To show that the test using $q'$ is more conservative, notice that if $\alpha$ is specified
for this test, the critical value $c_\alpha$ (let $c_\alpha$ be an integer) is determined from the
equation

$$\alpha = \sum_{r=c_\alpha+1}^{k} \binom{k}{r} q'^{\,r} (1 - q')^{k-r}.$$

Now $q' \ge q$, which implies

$$\alpha \ge \sum_{r=c_\alpha+1}^{k} \binom{k}{r} q^{r} (1 - q)^{k-r},$$

as is easily verified, since the upper tail of a binomial distribution does not
decrease when its success probability increases. This indicates the test using $q'$
is more conservative than the test using $q$. This
conservative test is easily carried out with the aid of the tables in this paper and
easily completed for $k < 50$ with “Tables of the Binomial Probability
Distribution” [4].
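
For small k the critical value can also be found by direct summation instead of printed binomial tables. The sketch below is ours: because an exact tail of $\alpha$ is rarely attainable with a discrete distribution, it takes $c_\alpha$ to be the smallest integer whose upper tail does not exceed $\alpha$ (an assumption), and it borrows k = 32 and $\sum D$ = 1.38654 from the numerical example of Section 4.

```python
from math import comb

def upper_tail(k, q, c):
    """Pr[S > c] when S is binomial with parameters k and q."""
    return sum(comb(k, r) * q**r * (1 - q)**(k - r) for r in range(c + 1, k + 1))

def critical_value(k, q, alpha):
    """Smallest integer c with Pr[S > c] <= alpha under the binomial (k, q) distribution."""
    for c in range(k + 1):
        if upper_tail(k, q, c) <= alpha:
            return c
    return k

k, alpha = 32, 0.05
q_prime = 0.5 + 1.38654 / k                 # average of 1/2 + D(m_i, n_i)
c = critical_value(k, q_prime, alpha)
print(c, upper_tail(k, q_prime, c), upper_tail(k, 0.5, c))
```

The last two printed values show the tail at $q'$ and at the smaller value $\tfrac12$ for the same cut-off, which is the monotonicity on which the argument rests.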

Considerable improvement in the test can be made if it were known that $p_i$
lay in some interval $I$, in which case we could substitute the

$$\sup_{p_i \in I} \, | G(m_i, n_i, p_i) |$$

for $D(m_i, n_i)$ in the formula for $q'$, thus approximating more closely the true
average of the probability of success.

Naturally if the $G(m_i, n_i, p_i)$ are unknown but their average is known we need
go no further than the test suggested by Hoeffding’s theorem.

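Suppose now that $k$ is large enough for application of the central limit
theorem and the $z_i$ and $S$ are defined as earlier in this section. Then as $k \to \infty$ the
distribution of $S$ is asymptotically normal with mean

$$\frac{k}{2} + \sum_{i=1}^{k} G(m_i, n_i, p_i),$$

satisfying the inequality

$$\frac{k}{2} - \sum_{i=1}^{k} D(m_i, n_i) \le \frac{k}{2} + \sum_{i=1}^{k} G(m_i, n_i, p_i) \le \frac{k}{2} + \sum_{i=1}^{k} D(m_i, n_i),$$

and variance

$$\frac{k}{4} - \sum_{i=1}^{k} G^2(m_i, n_i, p_i),$$

satisfying the inequality

$$\frac{k}{4} - \sum_{i=1}^{k} D^2(m_i, n_i) \le \frac{k}{4} - \sum_{i=1}^{k} G^2(m_i, n_i, p_i) \le \frac{k}{4}.$$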

If we decided to use the conservative test which we were led to by
Hoeffding’s theorem, thereby testing the hypothesis that $S$ was binomial with
parameters $k$,

$$q' = \frac12 + \frac{1}{k} \sum_{i=1}^{k} D(m_i, n_i),$$

and invoked the central limit theorem then we would reject $H_0$ if

$$\frac{S - kq'}{\sqrt{k q' (1 - q')}} > c_\alpha, \quad \text{or if} \quad S > c_\alpha \sqrt{k q' (1 - q')} + kq' = b_\alpha,$$

where $c_\alpha$ is the critical point found from the standard normal tables and we call
$b_\alpha$ the transformed critical point.

One question raised here however is, how large should $k$ be before we can use
the normal approximation to the binomial distribution so that the error of
approximation is not appreciable relative to the correction made by the $D(m_i, n_i)$.
Another related question is, will the error due to the approximation cause us to
lose the conservative aspect of our test. This can happen, since by taking the
largest possible mean of $S$, we can be underestimating its variance and hence
underestimating the transformed critical point. Hald [2] gives some indication
of the error of the normal approximation to the binomial. From his tables we
see that if $q'$ lies between .4 and .6 then $k$ can be somewhat less than 40 and the
approximation will still be very good. (The differences between the cumulative
distribution functions being less than .001 for $q' = \tfrac12$, $k = 36$.)

To investigate the related question we know that the conservative test we
were led to by the theorem requires us to reject if

$$S > c_\alpha \sqrt{k q' (1 - q')} + kq'.$$

If for fixed $c_\alpha$ and $k$ the right member of this inequality was an increasing
function of $q'$ this test would always be conservative since the transformed critical
point would always be greater than the true transformed critical point required
for testing $H_0$. (Hence the probability of rejecting when $H_0$ was true would
always be less than or equal to $\alpha$.) To see that the right member is not always
an increasing function of $q'$, we consider its derivative with respect to $q'$ which is

$$k + \frac{c_\alpha \sqrt{k}\,(1 - 2q')}{2 \sqrt{q' (1 - q')}}.$$

Obviously for some values of $k$, $c_\alpha$, and $q'$ (for example, large $q'$) this derivative
becomes negative. In most applications of this method, where we consider $k$
to be large enough for application of the central limit theorem, the
approximation will rarely interfere with the conservatism of the test, yet we offer another
test which will always be conservative if $k$ is large. This test says to reject if
$S \ge c_\alpha \sqrt{k/4} + kq'$ and is always conservative for large $k$ since the upper bound
of the variance of $S$ is $k/4$.

It may be of interest to give an indication of all the approximation involved
by considering this last and most conservative test. We do this by finding the
difference between the upper and lower bounds on the true (transformed)
critical value. We have already mentioned how we find the upper bound on the true
critical value, that is $c_\alpha \sqrt{k/4} + kq' = c_\alpha \sqrt{k/4} + k/2 + \sum D(m_i, n_i)$. The lower
bound is found by considering $S$ with mean $k/2 - \sum D(m_i, n_i)$ and variance
$k/4 - \sum D^2(m_i, n_i)$ and therefore is

$$c_\alpha \sqrt{\frac{k}{4} - \sum_{i=1}^{k} D^2(m_i, n_i)} + \frac{k}{2} - \sum_{i=1}^{k} D(m_i, n_i).$$

Hence a measure of all approximation involved would be

$$c_\alpha \left[ \sqrt{\frac{k}{4}} - \sqrt{\frac{k}{4} - \sum_{i=1}^{k} D^2(m_i, n_i)} \right] + 2 \sum_{i=1}^{k} D(m_i, n_i).$$


3. EXPLANATION OF TABLES


Since $G(m, n, p)$ is largest when at least one of $m$ and $n$ is small (and $m \neq n$),
Table 791 gives values of $G(m, n, p)$ for $m$ and $n$ varying from 1 to 8 while $p$
assumes values of .1, .2, .3, .4. Obviously if $m \to \infty$ and $n \to \infty$, then $G(m, n, p)
\to 0$. The symmetries of the function $G(m, n, p)$, namely $G(m, n, p)
= -G(m, n, 1-p) = -G(n, m, p)$, mean that the tables provide values for
$p$ = .9, .8, .7, .6 and that we need only present entries for $m > n$.

As previously implied $G(m, n, 0) = G(m, n, \tfrac12) = 0$. To examine the behavior
of the function for all $m$ we include a column $\lim_{m \to \infty} G(m, n, p)$ for the different
values of $n$ and $p$.

Table 792 gives values of $D(m, n)$ for $m$ and $n$ varying from 1 to 8 with
$m > n$. Strictly, although $D(m, n) = \sup_p | G(m, n, p) |$ the entries in the table
give only $\sup_p | G(m, n, p) |$ where $p$ ranges from 0 to $\tfrac12$ in steps of .005. Below
each entry in this table, we indicate in parentheses the value of $p$ (less than $\tfrac12$)
at which $| G(m, n, p) |$ attains this supremum. The entries below the diagonal
in this table could easily be filled in since $D(m, n) = D(n, m)$.
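
The limiting column can be evaluated without letting m grow. As a sketch under our own reading of the limit: when $m \to \infty$ the proportion $x$ settles at $p$, so that the limit is $\Pr[y < p] + \tfrac12 \Pr[y = p] - \tfrac12$, where the half weight on $y = p$ reflects $x$ falling above or below $p$ with probability tending to one half. The function name is ours.

```python
from math import comb

def limiting_G(n, p):
    """lim of G(m, n, p) as m grows: x settles at p while y = b/n stays random."""
    below = 0.0   # Pr[y < p]
    tie = 0.0     # Pr[y = p]
    for b in range(n + 1):
        w = comb(n, b) * p**b * (1 - p)**(n - b)
        if b < n * p - 1e-12:
            below += w
        elif abs(b - n * p) <= 1e-12:
            tie += w
    return below + 0.5 * tie - 0.5

for n in (1, 2, 3):
    print(n, [round(limiting_G(n, p), 5) for p in (0.1, 0.2, 0.3, 0.4)])
```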


4. NUMERICAL EXAMPLE 


A series of cultures was taken from each of 32 volunteers to examine them
for the presence of a staphylococcal type infection before and after the
volunteers took a uniform dose of a new antibiotic. The number of cultures taken on
individuals varied since several of them became contaminated, some were not
taken properly, or the volunteer failed to appear every time he was requested
to. It was desired to determine whether the antibiotic reduced the presence of
this type of staphylococcal infection in individuals. This hypothesis was tested
by the sign test model of this paper. That is, we tested

$$H_0: \Pr[x_i > y_i] + \tfrac12 \Pr[x_i = y_i] = \tfrac12 + G(m_i, n_i, p_i)$$

against

$$H_a: \Pr[x_i > y_i] + \tfrac12 \Pr[x_i = y_i] > \tfrac12 + G(m_i, n_i, p_i),$$

where $x_i$ represents the proportion of infected cultures drawn from the $i$th
individual before treatment and $y_i$ represents the proportion of infected cultures
after treatment. Because of the conservative aspects of the test we use, the
alternative really is

$$H_a: \Pr[x_i > y_i] + \tfrac12 \Pr[x_i = y_i] > \tfrac12 + D(m_i, n_i).$$

The number of infections and the number of cultures for each volunteer before
and after taking the antibiotic are listed as fractions in Table 793. We assign a
plus sign to each individual if the proportion of infection found from his
cultures is greater before the antibiotic treatment than after the treatment. If
the proportions before and after are the same, we would assign a plus sign
with probability $\tfrac12$. Otherwise we assign a minus sign. For each individual we
also list $D(m_i, n_i)$, which were found in Table 792.

We let $\alpha$, the level of significance, be equal to .05 and we use the conservative
test which considers the hypothesis that $S$, the number of plus signs, is
binomially distributed with parameters $k$ and

$$q' = \frac12 + \frac{1}{k} \sum_{i=1}^{k} D(m_i, n_i).$$

Here $k = 32$ and $\sum D(m_i, n_i) = 1.38654$, so that $q' = .5 + 1.38654/32 = .54333$.
TABLE 791. G(m, n, p)

[Entries for p = .1, .2, .3, .4, with n = 1, ..., 7 and m = n + 1, ..., 8, and a final column giving $\lim_{m \to \infty} G(m, n, p)$, are not legible in this copy.]

G(m, n, p) = -G(m, n, 1-p) = -G(n, m, p).


TABLE 792. D(m, n)

n \ m      2        3        4        5        6        7        8

  1     .04811   .09622   .13640   .16896   .19562   .21779   .23653
        (.210)   (.210)   (.205)   (.195)   (.190)   (.180)   (.170)

  2              .01985   .05063   .07321   .09766   .11764   .13695
                 (.095)   (.125)   (.115)   (.115)   (.115)   (.110)

  3                       .02877   .03358   .05158   .06442   .08317
                          (.290)   (.095)   (.090)   (.075)   (.085)

  4                                .04368   .02080   .04526   .05208
                                   (.245)   (.055)   (.215)   (.070)

  5                                         .05328   .02064   .02888
                                            (.215)   (.365)   (.055)

  6                                                  .05995   .01840
                                                     (.190)   (.145)

  7                                                           .06484
                                                              (.170)

D(m, n) = D(n, m); D(m, m) = 0.


With $q' = .5 + 1.38654/32 = .54333$ and $k$ appearing large enough for application
of the central limit theorem our test is to reject $H_0$ if

$$S > 1.64 \sqrt{32(.54333)(1 - .54333)} + 32(.54333), \quad \text{or if } S > 22.0081.$$

Since we have 24 plus signs we reject $H_0$. If we used the more conservative test
we would reject if $S > 1.64 \sqrt{32/4} + 32(.54333)$, or

if $S > 22.0278$. Again with 24 plus signs we would reject $H_0$.

To get an indication of all the approximation involved we would have to find
the lower bound on the true critical value which would be

$$1.64 \sqrt{\frac{32}{4} - .092967} + 32(.45667) = 19.2251.$$

Hence a measure of all approximation would be 22.0278 - 19.2251 = 2.8027.


TABLE 793. PROPORTIONS OF INFECTED CULTURES
BEFORE AND AFTER ANTIBIOTIC TREATMENT

          Proportion                              Proportion
Person  Before  After  Sign  D(m, n)  |  Person  Before  After  Sign  D(m, n)

   1     1/4     3/5    -    .04368   |    17     2/4     0/3    +    .02877
   2     2/4     1/3    +    .02877   |    18     4/5     1/4    +    .04368
   3     3/5     0/2    +    .07321   |    19     1/2     1/2    -    .00000
   4     4/6     1/2    +    .09766   |    20     6/8     3/6    +    .01840
   5     4/7     1/3    +    .06442   |    21     1/3     1/2    -    .01985
   6     1/4     0/4    +    .00000   |    22     1/2     2/3    -    .01985
   7     1/1     2/3    +    .09622   |    23     4/7     2/5    +    .02064
   8     2/5     4/8    -    .02888   |    24     2/2     0/1    +    .04811
   9     2/4     2/5    +    .04368   |    25     1/3     0/2    +    .01985
  10     1/3     3/4    -    .02877   |    26     2/5     1/4    +    .04368
  11     3/6     1/2    +    .09766   |    27     4/6     0/5    +    .05328
  12     5/7     4/6    +    .05995   |    28     1/3     0/1    +    .09622
  13     2/4     0/1    +    .13640   |    29     1/3     1/2    -    .01985
  14     2/4     1/3    +    .02877   |    30     3/5     1/5    +    .00000
  15     3/8     3/5    -    .02064   |    31     1/2     0/1    +    .04811
  16     2/3     2/4    +    .02877   |    32     3/4     1/3    +    .02877
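
The computations of this section can be retraced from the last column of Table 793. The sketch below is ours rather than the program run at the computer center; it uses the value 1.64 for the normal critical point as in the text, and its critical points agree with the printed ones to within a few thousandths.

```python
from math import sqrt

# D(m_i, n_i) for the 32 volunteers, in the order listed in Table 793.
D = [.04368, .02877, .07321, .09766, .06442, .00000, .09622, .02888,
     .04368, .02877, .09766, .05995, .13640, .02877, .02064, .02877,
     .02877, .04368, .00000, .01840, .01985, .01985, .02064, .04811,
     .01985, .04368, .05328, .09622, .01985, .00000, .04811, .02877]
plus_signs = 24                      # number of plus signs reported in the text
k = len(D)
c_alpha = 1.64                       # normal critical point used in the text

sum_D = sum(D)
sum_D2 = sum(d * d for d in D)
q_prime = 0.5 + sum_D / k

# Transformed critical points: first test, more conservative test, and lower bound.
b_first = c_alpha * sqrt(k * q_prime * (1 - q_prime)) + k * q_prime
b_upper = c_alpha * sqrt(k / 4) + k / 2 + sum_D
b_lower = c_alpha * sqrt(k / 4 - sum_D2) + k / 2 - sum_D

print(round(sum_D, 5), round(sum_D2, 6), round(q_prime, 5))
print(round(b_first, 4), round(b_upper, 4), round(b_lower, 4))
print(plus_signs > b_first, plus_signs > b_upper)
```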


I would also like to thank the Rich Computer Center, Atlanta, Georgia, for
use of their electronic computer.


REFERENCES 


[1] Dixon, W. J. and Mood, A. M., “The statistical sign test,” Journal of the American
Statistical Association, 41 (1946), 557-66.

[2] Hald, A., Statistical Theory with Engineering Applications, New York: John Wiley and
Sons, Inc., 1952, 580-3.

[3] Hoeffding, W., “On the distribution of the number of successes in independent trials,”
Annals of Mathematical Statistics, 27 (1956), 713-21.

[4] National Bureau of Standards, “Tables of the Binomial Probability Distribution,”
Applied Mathematics Series 6.




COMPARISON OF ESTIMATES OF CIRCULAR 
PROBABLE ERROR 


P. B. Moranda*
Autonetics 


The high cost of flight-testing weapon systems places an extraordinary
premium on efficient use of sample data. Although efficient methods
of treating data are well known to the statistician, the personnel
assigned to execution of testing and analysis are frequently not aware
of these methods and, as a result, may not form efficient estimates
with the raw data. This study formulates naturally motivated estimates
of the Circular Probable Error (CEP), an extensively used figure of
merit. The first two moments of each estimate are obtained and from
these tables of unbiasing factors and variances of each estimate are
formed for sample sizes ranging from 2 to 21. The efficiencies of the
estimates relative to the best linear unbiased estimate are obtained and
tabled for the same range of sample sizes.


1. INTRODUCTION 


A COMMON parameter for describing the accuracy of a weapon is the so-called
Circular Probable Error, which is the two dimensional analog of the Probable
Error of a single variable: just as the Probable Error measures the half
width of the mean-centered interval which includes 50% of the normal
(Gaussian) probability mass, the Circular Probable Error measures the radius of the
mean-centered circle which includes 50% of a bivariate probability mass. The
Circular Probable Error is almost uniformly tagged by the letters CEP and,
to conform with this sequence of letters, other names such as Circle of Equal
Probability and Circular Error Probable have been used.

Results of weapon tests will have the form of records of deviations of the
impact point from the target center; these will be commonly given as
measurements along two orthogonal directions which can be labeled $x$ and $y$. On the
basis of a sample (generally small in size) it is desired to estimate the CEP.
In the present development it is assumed that the $x$ and $y$ errors are
independent and that they each have a normal distribution with mean zero and
variance $\sigma^2$.

The determination of the “best” estimate of CEP under the assumptions
made above has been made by Chapman and Robbins [1]; however non-best
estimates should not be dismissed from consideration. Other estimates, while
not as efficient as the “best,” may have more robustness under variations in the
parameters of the assumed distribution or may be easier to compute.

In this paper four different estimates are developed and compared. 


2. ESTIMATION 


Under the assumption that $x$ and $y$ errors (or deviations from the aim point)
are independent and normally distributed with mean zero (known) and common
variance $\sigma^2$ (unknown), the joint density for $x$ and $y$ is

$$f(x, y) = \frac{1}{2\pi\sigma^2} \exp\left\{ -\frac{x^2 + y^2}{2\sigma^2} \right\}. \tag{1}$$

From this joint density it is easy to derive the density function for the radial
error $r = \sqrt{x^2 + y^2}$:

$$f(r) = \frac{r}{\sigma^2} \exp\left\{ -\frac{r^2}{2\sigma^2} \right\}. \tag{2}$$

The cumulative distribution function, $F(t)$, is then (for $t > 0$)

$$F(t) = \operatorname{Prob}\{ r \le t \} = \int_0^t f(r)\,dr = 1 - \exp\left\{ -\frac{t^2}{2\sigma^2} \right\}, \tag{3}$$

and the Circular Probable Error (CEP) is the positive root of the equation
$F(t) = .50$; thus

$$\mathrm{CEP} = 1.1774\,\sigma. \tag{4}$$

Since the CEP or 50-percentile of the radial error is related to the parameter
$\sigma$ in the fashion shown in (4) the problem of estimation of CEP is one of
obtaining a function of the $n$ pairs of sample points $(x_1, y_1), \ldots, (x_n, y_n)$
which will estimate $\sigma$.


* Now with Range Systems Operation, Aeronutronic, Newport Beach, California.
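
The constant in (4) follows from solving $1 - \exp\{-t^2/(2\sigma^2)\} = .50$, which gives $t = \sigma\sqrt{2 \ln 2}$. A short numerical check, with function and variable names of our own choosing:

```python
from math import exp, log, sqrt

def radial_cdf(t, sigma):
    """F(t) of equation (3): probability that the radial error does not exceed t."""
    return 1.0 - exp(-t * t / (2.0 * sigma * sigma))

cep_factor = sqrt(2.0 * log(2.0))        # CEP / sigma
print(round(cep_factor, 4))
print(radial_cdf(cep_factor * 3.0, 3.0)) # one half, whatever the value of sigma
```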


3. MAXIMUM LIKELIHOOD ESTIMATE (CEP;) 


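The Maximum Likelihood Estimate for $\sigma$ is easily found to be:

$$\hat{\sigma} = \sqrt{\frac{1}{2n} \sum_{i=1}^{n} (x_i^2 + y_i^2)}$$

This estimate has a slight bias; the adjusted unbiased estimate for the CEP is

$$\mathrm{CEP}_1 = 1.1774 \left( \frac{\sqrt{n}\,\Gamma(n)}{\Gamma[\tfrac{1}{2}(2n+1)]} \right) \sqrt{\frac{1}{2n} \sum_{i=1}^{n} (x_i^2 + y_i^2)} \tag{5}$$

where the quantity in parentheses is the unbiasing factor for $\hat{\sigma}$. The variance
for this estimate is

$$\operatorname{Var}(\mathrm{CEP}_1) = (1.1774)^2 \left[ \frac{n\,\Gamma^2(n)}{\Gamma^2[\tfrac{1}{2}(2n+1)]} - 1 \right] \sigma^2 \tag{6}$$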
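The maximum likelihood estimate and its unbiasing factor can be sketched in a few lines of Python. This is an illustration only: it assumes the unbiasing factor is $\sqrt{n}\,\Gamma(n)/\Gamma(n + \tfrac{1}{2})$ and that the variance equals $(1.1774)^2$ times (factor squared minus one) times $\sigma^2$, which follows from $E(\hat{\sigma}^2) = \sigma^2$; the function names are hypothetical.

```python
import math

CEP_CONSTANT = math.sqrt(2.0 * math.log(2.0))  # 1.1774 in the text

def unbiasing_factor_1(n):
    """sqrt(n) * Gamma(n) / Gamma(n + 1/2), evaluated through log-gamma."""
    return math.sqrt(n) * math.exp(math.lgamma(n) - math.lgamma(n + 0.5))

def cep1(points):
    """Unbiased CEP estimate from (x, y) deviations measured about the aim point."""
    n = len(points)
    sigma_hat = math.sqrt(sum(x * x + y * y for x, y in points) / (2.0 * n))
    return CEP_CONSTANT * unbiasing_factor_1(n) * sigma_hat

def var_cep1(n, sigma=1.0):
    """Variance of the estimate: (1.1774)^2 * (factor^2 - 1) * sigma^2."""
    return CEP_CONSTANT ** 2 * (unbiasing_factor_1(n) ** 2 - 1.0) * sigma ** 2
```

For n = 2 the factor evaluates to about 1.0638 and the variance to about .1827 $\sigma^2$, which agrees with the tabulated values for this estimate.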


It has been shown by Chapman and Robbins [1] that CEP$_1$ is the most
efficient unbiased estimate possible. Comparison of estimates is made on the
basis of the relative efficiency, which is the ratio of the variance of CEP$_1$ to
the variance of the estimate.


4. “PROBABLE ERROR” ESTIMATE (CEP$_2$)


A second estimate for $\sigma$ (or CEP) is a slight variant of the one commonly
used in establishing the so-called Probable Error; deviations are computed
from the mean of the sample instead of the true mean. This estimate corresponds
closely with the quantity commonly referred to in engineering work.
The estimate is the maximum likelihood estimate of $\sigma$ if the bivariate
distribution of x and y has a circular distribution centered at an unknown point
$(m_1, m_2)$. This estimate is manifestly more robust than CEP$_1$ in the presence of
variations in the center point.

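In accord with previously used notation

$$\hat{\sigma} = \sqrt{\frac{1}{2n} \sum_{i=1}^{n} \{(x_i - \bar{x})^2 + (y_i - \bar{y})^2\}}$$

where $\bar{x}$ and $\bar{y}$ are the sample means of the x's and y's respectively.
This estimate also is biased; the adjusted unbiased estimate for CEP is

$$\mathrm{CEP}_2 = 1.1774 \left( \frac{\sqrt{n}\,\Gamma(n-1)}{\Gamma[\tfrac{1}{2}(2n-1)]} \right) \sqrt{\frac{1}{2n} \sum_{i=1}^{n} \{(x_i - \bar{x})^2 + (y_i - \bar{y})^2\}} \tag{7}$$

and the variance is

$$\operatorname{Var}(\mathrm{CEP}_2) = (1.1774)^2 \left[ \frac{(n-1)\,\Gamma^2(n-1)}{\Gamma^2[\tfrac{1}{2}(2n-1)]} - 1 \right] \sigma^2 \tag{8}$$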
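A sketch of this second estimate in Python follows. It assumes the unbiasing factor $\sqrt{n}\,\Gamma(n-1)/\Gamma(n - \tfrac{1}{2})$, which comes from the sum of squares about the sample means having 2(n - 1) degrees of freedom; the function names are hypothetical.

```python
import math

CEP_CONSTANT = math.sqrt(2.0 * math.log(2.0))  # 1.1774 in the text

def unbiasing_factor_2(n):
    """sqrt(n) * Gamma(n - 1) / Gamma(n - 1/2); requires n >= 2."""
    if n < 2:
        raise ValueError("at least two impact points are needed")
    return math.sqrt(n) * math.exp(math.lgamma(n - 1) - math.lgamma(n - 0.5))

def cep2(points):
    """Unbiased CEP estimate using deviations from the sample mean point."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    ss = sum((x - x_bar) ** 2 + (y - y_bar) ** 2 for x, y in points)
    return CEP_CONSTANT * unbiasing_factor_2(n) * math.sqrt(ss / (2.0 * n))
```

Because deviations are taken from the sample means, the result does not change when every impact point is shifted by the same amount, which is the robustness to the center point described above.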


5. SAMPLE MEDIAN (CEP$_3$)


A third estimate for CEP is the sample median of n=2k+1 observations 
of the radial error. Since the CEP is the median of the distribution F(t), given 
by (3), it is natural to suppose that the sample median would have merit in 
estimating CEP. This estimate is an example of one which is quite easy to ob- 
tain from data. 

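The probability element for the median is given by

$$\frac{\Gamma(n+1)}{\Gamma^2\left(\frac{n+1}{2}\right)} \left[1 - \exp\left\{-\frac{r^2}{2\sigma^2}\right\}\right]^{(n-1)/2} \left[\exp\left\{-\frac{r^2}{2\sigma^2}\right\}\right]^{(n+1)/2} \frac{r}{\sigma^2}\,dr$$

where the preceding expression is the probability that the sample median is in
the range r to r+dr.

It can be shown that the expected value of the sample median, which is denoted
by $\tilde{r}_n$, is

$$E(\tilde{r}_n) = \sigma\,\sqrt{2}\,\Gamma(3/2)\,\frac{\Gamma(n+1)}{\Gamma^2\left(\frac{n+1}{2}\right)} \sum_{j=0}^{(n-1)/2} (-1)^j \binom{(n-1)/2}{j} \left(\frac{n+1}{2} + j\right)^{-3/2} \tag{9}$$

An unbiased estimate of the form

$$\mathrm{CEP}_3 = 1.1774\,\frac{\tilde{r}_n}{M(n)} \tag{10}$$

can be formed, where M(n) is the numerical factor multiplying $\sigma$ on the right
side of equation (9).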


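The second moment of the sample median can be found to be

$$E(\tilde{r}_n^2) = 2\sigma^2\,\frac{\Gamma(n+1)}{\Gamma^2\left(\frac{n+1}{2}\right)} \sum_{j=0}^{(n-1)/2} (-1)^j \binom{(n-1)/2}{j} \left(\frac{n+1}{2} + j\right)^{-2} \tag{11}$$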
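The moments of the sample median can be evaluated numerically by expanding the order-statistic density of the radial error in a finite alternating series. The Python sketch below does this under the assumption n = 2k + 1 and the Rayleigh-type density (2); the helper names are hypothetical, and the series form is one convenient way of writing the result rather than necessarily the form printed in the original.

```python
import math

def _median_series(n, power):
    """C * sum_j (-1)^j binom(k, j) / (k + 1 + j)^power, k = (n - 1) / 2, C = n! / (k!)^2."""
    if n % 2 == 0:
        raise ValueError("n must be odd (n = 2k + 1)")
    k = (n - 1) // 2
    c = math.factorial(n) / math.factorial(k) ** 2
    return c * sum((-1) ** j * math.comb(k, j) / (k + 1 + j) ** power
                   for j in range(k + 1))

def median_mean_factor(n):
    """M(n): expected sample median of n radial errors when sigma = 1."""
    return math.sqrt(2.0) * math.gamma(1.5) * _median_series(n, 1.5)

def var_cep3(n):
    """Variance of 1.1774 * median / M(n), in units of sigma^2."""
    m = median_mean_factor(n)
    second_moment = 2.0 * _median_series(n, 2.0)
    return 2.0 * math.log(2.0) * (second_moment / m ** 2 - 1.0)
```

For n = 3 this gives 1/M(3) of about .8254 and a variance of about .1879 $\sigma^2$, in agreement with the tabulated values for the median estimate.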


6. MEAN RADIAL ERROR (CEP$_4$)


As a final estimate for CEP the arithmetic mean of the radial error can be 
used. Motivation for its use is that radial errors instead of component errors 
may be reported and under these circumstances it is relatively easy to compute. 
It is well known to be a consistent estimate of the 1st moment of the distribu- 
tion (2). 

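Since

$$E(r) = \sqrt{2}\,\Gamma(3/2)\,\sigma$$

an unbiased estimate of CEP is

$$\mathrm{CEP}_4 = \frac{1.1774}{\sqrt{2}\,\Gamma(3/2)} \left( \frac{1}{n} \sum_{i=1}^{n} r_i \right) \tag{12}$$

The variance of the estimate (12) is found to be

$$\operatorname{Var}(\mathrm{CEP}_4) = (1.1774)^2\,\frac{1}{n} \left( \frac{4}{\pi} - 1 \right) \sigma^2 \tag{13}$$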
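The mean radial error estimate needs only the reported radial errors. A minimal Python sketch, assuming $E(r) = \sqrt{2}\,\Gamma(3/2)\,\sigma = \sigma\sqrt{\pi/2}$ and $\operatorname{Var}(r) = (2 - \pi/2)\sigma^2$ for the density (2), is given below; the names are illustrative.

```python
import math

CEP_CONSTANT = math.sqrt(2.0 * math.log(2.0))    # 1.1774 in the text
MEAN_FACTOR = math.sqrt(2.0) * math.gamma(1.5)   # E(r) / sigma = sqrt(pi / 2)

def cep4(radial_errors):
    """Unbiased CEP estimate from the arithmetic mean of the radial errors."""
    return CEP_CONSTANT / MEAN_FACTOR * sum(radial_errors) / len(radial_errors)

def var_cep4(n, sigma=1.0):
    """Variance of the estimate: (1.1774)^2 * (4 / pi - 1) / n * sigma^2."""
    return CEP_CONSTANT ** 2 * (4.0 / math.pi - 1.0) / n * sigma ** 2
```

The multiplier on the mean radial error is about .9394 for every n, and for n = 2 the variance is about .1894 $\sigma^2$.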


7. SUMMARY 


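For convenience the various estimates of CEP are given below

$$\mathrm{CEP}_1 = 1.1774\,\frac{\sqrt{n}\,\Gamma(n)}{\Gamma[\tfrac{1}{2}(2n+1)]} \sqrt{\frac{1}{2n} \sum_{i=1}^{n} (x_i^2 + y_i^2)} \tag{14}$$

$$\mathrm{CEP}_2 = 1.1774\,\frac{\sqrt{n}\,\Gamma(n-1)}{\Gamma[\tfrac{1}{2}(2n-1)]} \sqrt{\frac{1}{2n} \sum_{i=1}^{n} \{(x_i - \bar{x})^2 + (y_i - \bar{y})^2\}} \tag{15}$$

$$\mathrm{CEP}_3 = 1.1774\,\frac{\tilde{r}_n}{M(n)} \tag{16}$$

($\tilde{r}_n$ is sample median, M(n) is explicitly given in equation (9) and is listed in Table 799b)

$$\mathrm{CEP}_4 = \frac{1.1774}{\sqrt{2}\,\Gamma(3/2)} \left( \frac{1}{n} \sum_{i=1}^{n} r_i \right) \tag{17}$$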

In Table 798 the variances of the estimates are given as a function of the
number of observations. Although the information in Table 798 can be
recovered from that in Table 799a, trends are more easily determined from the
form presented.


TABLE 798. VARIANCE OF ESTIMATES OF CEP
(Multiply table entry by $\sigma^2$)

   n     CEP_1    CEP_2    CEP_3    CEP_4

   2     .1827                      .1894
   3     .1199    .1827    .1879    .1263
   4     .0892    .1199             .0946
   5     .0710    .0892    .1241    .0757
   6     .0589    .0710             .0631
   7     .0504    .0589    .0925    .0541
   8     .0440    .0504             .0473
   9     .0390    .0440    .0736    .0421
  10     .0351    .0390             .0378
  11     .0319    .0351    .0611    .0344
  12     .0292    .0319
  13     .0269    .0292    .0523    .0291
  14     .0250    .0269             .0271
  15     .0233    .0250    .0457    .0252
  16     .0218    .0233             .0236
  17     .0205    .0218    .0406    .0222
  18     .0194    .0205             .0210
  19     .0184    .0194    .0365    .0199
  20     .0174    .0184             .0189
  21     .0166    .0174    .0333    .0180


In Table 799a the relative efficiency of the four estimates is given as a
function of n.


TABLE 799a. EFFICIENCY OF ESTIMATES

   n     CEP_1    CEP_2    CEP_3    CEP_4

   2    100.00    48.23             96.45
   3    100.00    65.64    63.80    94.97
   4    100.00    74.37             94.17
   5    100.00    79.57    57.19    93.66
   6    100.00    83.03             93.32
   7    100.00    85.48    54.47    93.07
   8    100.00    87.32             92.88
   9    100.00    88.75    52.99    92.73
  10    100.00    89.88             92.61
  11    100.00    90.81    52.10    92.51
  12    100.00    91.59             92.43
  13    100.00    92.24    51.43    92.36
  14    100.00    92.80             92.30
  15    100.00    93.28    50.97    92.24
  16    100.00    93.70             92.20
  17    100.00    94.08    50.62    92.16
  18    100.00    94.41             92.12
  19    100.00    94.70    50.34    92.09
  20    100.00    94.97             92.06
  21    100.00    95.21    50.10    92.03


TABLE 799b. UNBIASING FACTOR FOR ESTIMATES OF $\sigma$
(In Parenthesis 1.1774 times the Unbiasing Factor)

Unbiasing factor, by estimate:
  CEP_1:  $\sqrt{n}\,\Gamma(n) / \Gamma[\tfrac{1}{2}(2n+1)]$
  CEP_2:  $\sqrt{n}\,\Gamma(n-1) / \Gamma[\tfrac{1}{2}(2n-1)]$
  CEP_3:  $1 / M(n)$
  CEP_4:  $1 / [\sqrt{2}\,\Gamma(3/2)]$

   n        CEP_1              CEP_2              CEP_3             CEP_4

   2    1.0638 (1.2525)    1.5958 (1.8789)                      .7979 (.9394)
   3    1.0424 (1.2273)    1.3029 (1.5340)    .8254 (.9718)     .7979 (.9394)
   4    1.0317 (1.2147)           (1.4171)                      .7979 (.9394)
   5    1.0253 (1.2072)    1.1534 (1.3580)    .8339 (.9818)     .7979 (.9394)
   6    1.0210 (1.2021)    1.1231 (1.3223)                      .7979 (.9394)
   7    1.0180 (1.1986)    1.1028 (1.2984)    .8380 (.9867)     .7979 (.9394)
   8    1.0157 (1.1959)    1.0883 (1.2814)                      .7979 (.9394)
   9    1.0140 (1.1939)    1.0774 (1.2685)    .8404 (.9895)     .7979 (.9394)
  10    1.0126 (1.1922)    1.0688 (1.2584)                      .7979 (.9394)
  11    1.0114 (1.1908)    1.0620 (1.2504)    .8419 (.9913)     .7979 (.9394)
  12    1.0105 (1.1898)    1.0564 (1.2438)                      .7979 (.9394)
  13    1.0097 (1.1888)           (1.2383)    .8430 (.9925)     .7979 (.9394)
  14    1.0090 (1.1880)    1.0478 (1.2337)                      .7979 (.9394)
  15    1.0084 (1.1873)    1.0444 (1.2297)    .8438 (.9935)     .7979 (.9394)
  16    1.0078 (1.1866)    1.0414 (1.2261)                      .7979 (.9394)
  17    1.0074 (1.1861)           (1.2232)          (.9942)     .7979 (.9394)
  18    1.0070 (1.1856)    1.0366 (1.2205)                      .7979 (.9394)
  19    1.0066 (1.1852)    1.0346 (1.2181)    .8449 (.9948)     .7979 (.9394)
  20    1.0063 (1.1848)    1.0327 (1.2159)                      .7979 (.9394)
  21    1.0060 (1.1845)    1.0311 (1.2140)    .8454 (.9954)     .7979 (.9394)
It is apparent from Table 799a that the median is a relatively poor estimate
while, to the contrary, the mean radial error is very good.

In order to obtain unbiased estimates of the CEP from sample data certain
corrective factors must be computed and applied. For convenience those factors
are shown in Table 799b.

While the inefficient estimates of CEP will not be used frequently since the
best estimate is not difficult to compute, they have been included to give an
idea of what their relative efficiencies are, and also because certain of the
inefficient estimates, such as the median $\tilde{r}_n$, may have applications in fields of
interest other than the estimation of CEP.
A NOTE ON MEAN SQUARE SUCCESSIVE DIFFERENCES


J. N. K. Rao 
Iowa State University 


This paper deals with the use of higher order mean square successive
differences for estimating dispersion when a strong trend is present in
the mean value and the method of first differences does not adequately
eliminate the trend. The first four moments of the mean square successive
second difference, $\delta_2^2$, are derived in samples from an arbitrary
population with constant mean (or with linear trend in the means). Because
the efficiency of $\delta_2^2$ relative to $s^2$ increases with $\beta_2$, the approximate
distribution of $\delta_2^2$ is discussed only for a leptokurtic population,
specifically for the symmetrical two-tailed exponential population. For
populations with varying mean the first two moments of $\delta_2^2$ are given and the
effect of trend of the mean values on these moments is discussed. Short
tables of the efficiency of the mean square successive second, third, and
fourth differences, all relative to $s^2$, are given.


1. INTRODUCTION


The estimation of dispersion from successive differences, when successive
observations made at regular intervals of time are subject to the same standard
error and the means of the populations from which they are drawn display
some kind of trend, is well known. Moore [3] has given a historical account of
methods, based on the first variate difference, for estimating dispersion; and
has discussed the properties of the mean square successive first difference

$$\delta^2 = (n-1)^{-1} \sum_{i=1}^{n-1} (x_{i+1} - x_i)^2$$

in samples drawn from various populations.
Estimators based on higher order successive differences may be useful when
a strong trend is present and the method of first differences does not adequately
eliminate the trend. Morse and Grubbs [4] have illustrated the use of
estimators based on successive differences of higher order and have given a table
of efficiencies of mean square successive differences up to the tenth order,
relative to the sample variance

$$s^2 = (n-1)^{-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$



for different sample sizes from a normal population. Kamat [1] has quoted
three examples from Tintner to illustrate the extent to which second differences
succeed in eliminating trend, and has derived the first four moments of two
estimates based on the second variate difference:

(i) the mean square successive second difference

$$\delta_2^2 = (n-2)^{-1} \sum_{i=1}^{n-2} (x_{i+2} - 2x_{i+1} + x_i)^2$$

(ii) the mean absolute successive second difference

$$d_2 = (n-2)^{-1} \sum_{i=1}^{n-2} |x_{i+2} - 2x_{i+1} + x_i|$$

where the $x_i$ are a random sample from a normal population. Kamat has also
discussed the approximate distributions of these estimates. Kamat has
continued his discussion of first and second successive differences [2].
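The mean square successive difference of any order can be computed directly from its definition. The Python sketch below is an illustration (the function name is hypothetical): it forms the k-th forward difference with binomial weights, squares it, and averages over the n - k available terms.

```python
from math import comb

def mean_square_successive_difference(x, order=1):
    """(n - order)^-1 times the sum of squared order-th forward differences of x."""
    n = len(x)
    if n <= order:
        raise ValueError("need more observations than the difference order")
    # Binomial weights of the order-th difference, e.g. (1, -2, 1) for order 2.
    weights = [(-1) ** (order - j) * comb(order, j) for j in range(order + 1)]
    total = 0.0
    for i in range(n - order):
        diff = sum(w * x[i + j] for j, w in enumerate(weights))
        total += diff * diff
    return total / (n - order)
```

With an exactly linear sequence the second-order statistic is zero, and with an exactly quadratic sequence the third-order statistic is zero, which is the sense in which higher differences eliminate polynomial trend.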

In this paper we derive the first four moments of $\delta_2^2$ in samples from an
arbitrary population with constant mean (or with linear trend in the means).
Because the efficiency of $\delta_2^2$ relative to $s^2$ increases with $\beta_2$, we discuss the
approximate distribution of $\delta_2^2$ only for leptokurtic populations, and specifically
for the symmetrical two-tailed exponential population. For populations with varying
mean we give the first two moments of $\delta_2^2$ and discuss the effect of trend of the
mean values on these moments. We supplement Moore’s work on the efficiency
of $\delta^2$ by giving short tables of the efficiency of the mean square successive
second, third, and fourth differences, all relative to $s^2$.


2. MOMENTS OF $\delta_2^2$


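Let $\mu_r$ (r = 2, 3, ...) be moments about the mean $\mu_1'$ of the sampled
population; let $X_i = x_i - \mu_1'$; denote the central and non-central moments of $\delta_2^2$ by
$\mu_r(\delta_2^2)$ and $\mu_r'(\delta_2^2)$. Then

$$\begin{aligned}
(n-2)\delta_2^2 &= \sum_{i=1}^{n-2} (X_{i+2} - 2X_{i+1} + X_i)^2 \\
&= \Big(6\sum_{i=1}^{n} X_i^2 - 5X_1^2 - X_2^2 - X_{n-1}^2 - 5X_n^2\Big) \\
&\quad - 4\Big(2\sum_{i=1}^{n-1} X_iX_{i+1} - X_1X_2 - X_nX_{n-1}\Big) + 2\sum_{i=1}^{n-2} X_iX_{i+2}
\end{aligned} \tag{2.1}$$

so that

$$\begin{aligned}
(n-2)^r \mu_r'(\delta_2^2) = E\Big[&\,6\sum_{i=1}^{n} X_i^2 - 5X_1^2 - X_2^2 - X_{n-1}^2 - 5X_n^2 \\
&- 4\Big(2\sum_{i=1}^{n-1} X_iX_{i+1} - X_1X_2 - X_nX_{n-1}\Big) + 2\Big(\sum_{i=1}^{n-2} X_iX_{i+2}\Big)\Big]^r
\end{aligned} \tag{2.2}$$

Evaluating this expectation in terms of moments, we obtain, after some tedious
algebra:

$$\begin{aligned}
(n-2)\,\mu_1'(\delta_2^2) &= 6n\mu_2 - 12\mu_2, \\
(n-2)^2\mu_2(\delta_2^2) &= (36\mu_4 + 32\mu_2^2)n - (92\mu_4 + 76\mu_2^2), \\
(n-2)^3\mu_3(\delta_2^2) &= (216\mu_6 + 1800\mu_4\mu_2 - 1800\mu_3^2 - 1248\mu_2^3)n \\
&\quad - (612\mu_6 + 5316\mu_4\mu_2 - 5184\mu_3^2 - 3624\mu_2^3), \\
(n-2)^4\mu_4(\delta_2^2) &= (3888\mu_4^2 + 6912\mu_4\mu_2^2 + 3072\mu_2^4)n^2 \\
&\quad + (1296\mu_8 + 24192\mu_6\mu_2 - 55296\mu_5\mu_3 + 9728\mu_4^2 \\
&\qquad - 79296\mu_4\mu_2^2 + 23424\mu_3^2\mu_2 - 26544\mu_2^4)n \\
&\quad - (3932\mu_8 + 76336\mu_6\mu_2 - 170368\mu_5\mu_3 + 71772\mu_4^2 \\
&\qquad - 168432\mu_4\mu_2^2 + 77056\mu_3^2\mu_2 - 67320\mu_2^4)
\end{aligned} \tag{2.3}$$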
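For normal populations, after substituting $\mu_3 = 0$, $\mu_5 = 0$, $\mu_4 = 3\mu_2^2$,
$\mu_6 = 15\mu_2^3$, $\mu_8 = 105\mu_2^4$, equations (2.3) reduce to

$$\begin{aligned}
(n-2)\,\mu_1'(\delta_2^2) &= (6n - 12)\mu_2, \\
(n-2)^2\mu_2(\delta_2^2) &= (140n - 352)\mu_2^2, \\
(n-2)^3\mu_3(\delta_2^2) &= (7392n - 21504)\mu_2^3, \\
(n-2)^4\mu_4(\delta_2^2) &= (58800n^2 + 322080n - 1631232)\mu_2^4.
\end{aligned} \tag{2.4}$$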


Equations (2.4) agree with the moments given by Kamat [1]. We note that
$\delta_2^2/6$ is an unbiased estimate of $\mu_2$. The efficiency of $\delta^2$ and $\delta_2^2$, relative to $s^2$,
for several values of $\beta_2 = \mu_4/\mu_2^2$ and n may be read from Table 803.
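The efficiency of $\delta_2^2$ relative to $s^2$ can be computed from the second of equations (2.3) together with the standard variance of the sample variance. The Python sketch below is illustrative only: the function names are hypothetical, the constant term $76\mu_2^2$ in the variance is the value consistent with the normal-theory result (140n - 352), and variances are expressed in units of $\mu_2^2$.

```python
def var_s2(n, beta2):
    """Variance of the sample variance s^2, in units of mu_2^2."""
    return (beta2 - 1.0) / n + 2.0 / (n * (n - 1.0))

def var_delta2_sq_over_6(n, beta2):
    """Variance of delta_2^2 / 6 for constant mean, in units of mu_2^2."""
    numerator = (36.0 * beta2 + 32.0) * n - (92.0 * beta2 + 76.0)
    return numerator / (36.0 * (n - 2.0) ** 2)

def percent_efficiency(n, beta2):
    """Percent efficiency of delta_2^2 / 6 relative to s^2."""
    return 100.0 * var_s2(n, beta2) / var_delta2_sq_over_6(n, beta2)
```

For example, `percent_efficiency(5, 2)` is about 37 and `percent_efficiency(10, 3)` is about 49, matching the corresponding entries of Table 803.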


TABLE 803. PERCENT EFFICIENCY OF $\delta_2^2$ RELATIVE TO $s^2$, AND
IN PARENTHESES $\delta^2$ RELATIVE TO $s^2$ (TAKEN FROM MOORE)

                                          $\beta_2$
   n       1         2         3         4         5         6         10         ∞

   5    19 (40)   37 (64)   47 (73)   52 (77)   56 (80)   58 (82)   64 (86)    74 (91)
  10    10 (20)   36 (57)   49 (69)   56 (76)   61 (80)   65 (82)   73 (87)    86 (95)
  15     7 (13)   36 (54)   50 (68)   58 (75)   64 (80)   67 (82)   76 (88)    91 (97)
  20     5 (10)   35 (53)   50 (68)   59 (75)   65 (80)   69 (83)   78 (89)    93 (98)
  25     4 ( 8)   35 (53)   50 (68)   59 (75)   65 (80)   70 (83)   79 (89)    94 (98)
   ∞     0 ( 0)   35 (50)   51 (67)   61 (75)   68 (80)   73 (83)   83 (90)   100 (100)


Several interesting results emerge from Table 803. For fixed n, the efficiency
of $\delta^2$ (or of $\delta_2^2$) increases with $\beta_2$. For fixed $\beta_2$ less than or equal to 4.26 (for $\delta^2$)
or 2.34 (for $\delta_2^2$), the efficiency steadily decreases with n; for greater $\beta_2$, the
efficiency decreases to a minimum and then increases to a limiting value. The
variations of efficiency with n, almost negligible with $\delta^2$, are fairly marked with
$\delta_2^2$ when $\beta_2$ is greater than 4.


3. APPROXIMATE DISTRIBUTION OF $\delta_2^2$ FROM A
LEPTOKURTIC PARENT POPULATION


Moore [3] has discussed the distribution of $\delta^2$ from the exponential
population

$$p(x) = \tfrac{1}{2} e^{-|x|}, \qquad -\infty < x < +\infty,$$

for which $\mu_{2r} = (2r)!$, $\mu_{2r+1} = 0$. To approximate the distribution of $\delta_2^2$, we
substitute these moments in equation (2.3), yielding:

$$\begin{aligned}
\beta_1(\delta_2^2) &= \frac{(231936n - 666816)^2}{(992n - 2512)^3}, \\
\beta_2(\delta_2^2) &= \frac{2952192n^2 + 84657408n - 292556160}{(992n - 2512)^2}.
\end{aligned} \tag{3.1}$$

These betas, tabulated in Table 804 for selected values of n, lie in the Pearson
Type VI region, close to the log-normal line (cf. Moore [3, p. 444], for $\delta^2$).

Following Moore, we attempt to approximate to the distribution of $\delta_2^2$ by
taking $\delta_2^2/\sigma^2$ to be distributed as $6\chi^2/\nu$; thus reproducing the first two moments
of $\delta_2^2/\sigma^2$:

$$\mu_1'(\delta_2^2/\sigma^2) = 6, \qquad \mu_2(\delta_2^2/\sigma^2) = 72/\nu = (248n - 628)/(n - 2)^2.$$
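The moment ratios for the two-tailed exponential parent, and the degrees of freedom of the matching chi-square approximation, are simple functions of n. The following Python sketch evaluates them; the function names are hypothetical, and the chi-square moment ratios $8/\nu$ and $3 + 12/\nu$ are the standard values for a chi-square variate with $\nu$ degrees of freedom.

```python
def exponential_betas(n):
    """beta_1 and beta_2 of delta_2^2 for the two-tailed exponential parent."""
    m2 = 992.0 * n - 2512.0                                  # (n-2)^2 * mu_2
    m3 = 231936.0 * n - 666816.0                             # (n-2)^3 * mu_3
    m4 = 2952192.0 * n * n + 84657408.0 * n - 292556160.0    # (n-2)^4 * mu_4
    return m3 ** 2 / m2 ** 3, m4 / m2 ** 2

def chi_square_fit(n):
    """nu, beta_1 and beta_2 of 6 * chi2 / nu matched on the first two moments."""
    nu = 72.0 * (n - 2.0) ** 2 / (248.0 * n - 628.0)
    return nu, 8.0 / nu, 3.0 + 12.0 / nu
```

For n = 10, `exponential_betas(10)` gives about (6.717, 15.475) while `chi_square_fit(10)` gives $\nu$ of about 2.488 with betas of about (3.215, 7.823), so the size of the discrepancy can be read off directly.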


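TABLE 804. MOMENT RATIOS OF $\delta_2^2$ IN SAMPLES FROM
TWO-TAILED EXPONENTIAL POPULATION

                                      First approximation
    n     $\beta_1$     $\beta_2$         $\nu$       $\beta_1$     $\beta_2$

   10     6.7174    15.4749      2.4881    3.2153    7.8230
   20     3.0322     8.5975      5.3850    1.4856    5.2284
   30     1.9565     6.6053      8.2866     .9654    4.4481
   40     1.4440     5.6587     11.1890     .7150    4.0725
   50     1.1441     5.1057     14.0917     .5677    3.8516
   75     0.7553     4.3853     21.3492     .3747    3.5621
  100     0.5614     4.0322     28.6070     .2797    3.4195


Columns 4, 5, 6 of Table 804 give the values of $\nu$, $\beta_1$, and $\beta_2$ so obtained. The
approximation is not very good; the discrepancies are strikingly similar to those
in Moore [3, p. 448].


4. BIAS OF ESTIMATORS


Let $\theta_i$ be the mean of the population when the observation $x_i$ is taken; and
let $\Delta^2\theta_i/\sigma = (\theta_{i+2} - 2\theta_{i+1} + \theta_i)/\sigma$ be small, so that its cube and higher powers may
be neglected. By expanding the expectations of the first and second powers of

$$(n-2)\delta_2^2 = \sum_{i=1}^{n-2} \left[ (x_{i+2} - \theta_{i+2}) - 2(x_{i+1} - \theta_{i+1}) + (x_i - \theta_i) + \Delta^2\theta_i \right]^2 \tag{4.1}$$

we get

$$\begin{aligned}
\mu_1'(\delta_2^2/6) &= \mu_2 \Big[ 1 + \sum_{i=1}^{n-2} (\Delta^2\theta_i)^2 / 6\mu_2(n-2) \Big], \\
\mu_2(\delta_2^2/6) &= (3n-6)^{-2} \mu_2^2 \Big[ (9\beta_2 + 8)n - (23\beta_2 + 19) \\
&\quad - \frac{\mu_3}{\mu_2^2} \big( 3\Delta^2\theta_1 + \Delta^2\theta_2 + \Delta^2\theta_{n-3} + 3\Delta^2\theta_{n-2} \big) \\
&\quad + \frac{2}{\mu_2} \Big( 3\sum_{i=1}^{n-2} (\Delta^2\theta_i)^2 - 4\sum_{i=1}^{n-3} \Delta^2\theta_i \Delta^2\theta_{i+1} + \sum_{i=1}^{n-4} \Delta^2\theta_i \Delta^2\theta_{i+2} \Big) \Big],
\end{aligned} \tag{4.2}$$

which agrees with equations (3) and (4) of Kamat [2] for the normal population.
Equations (4.2) may be compared with the corresponding expressions for
$\delta^2$ and $s^2$, taken from Moore [3] with a correction of his equation (5.2):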


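$$\begin{aligned}
\mu_1'(\delta^2/2) &= \mu_2 \Big[ 1 + \sum_{i=1}^{n-1} (\Delta\theta_i)^2 / 2\mu_2(n-1) \Big], \\
\mu_2(\delta^2/2) &= \tfrac{1}{2}(n-1)^{-2} \mu_2^2 \Big[ 2\beta_2 n - (3\beta_2 - 1) - \frac{2\mu_3}{\mu_2^2} (\Delta\theta_{n-1} - \Delta\theta_1) \\
&\quad + \frac{4}{\mu_2} \Big( \sum_{i=1}^{n-1} [\Delta\theta_i]^2 - \sum_{i=1}^{n-2} \Delta\theta_i \Delta\theta_{i+1} \Big) \Big]
\end{aligned} \tag{4.3}$$

$$\begin{aligned}
\mu_1'(s^2) &= \mu_2 \Big[ 1 + \sum_{i=1}^{n} (\theta_i - \bar{\theta})^2 / \mu_2(n-1) \Big], \\
\mu_2(s^2) &= \mu_2^2 \Big[ \frac{\beta_2 - 1}{n} + \frac{2}{n(n-1)} + \frac{4}{(n-1)^2\mu_2} \sum_{i=1}^{n} (\theta_i - \bar{\theta})^2 \Big]
\end{aligned} \tag{4.4}$$

where $\bar{\theta}$ is the mean of the $\theta_i$.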

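We see that if the mean values contain a strong polynomial-like trend, such
that $\sum(\Delta^2\theta_i)^2/6 < \sum(\Delta\theta_i)^2/2 < \sum(\theta_i - \bar{\theta})^2$, then $\delta_2^2$ will have smaller bias than
$\delta^2$ and much smaller bias than $s^2$. For linear trend, the bias of $\delta_2^2$ is zero, and its
variance is given by (2.3). For quadratic trend, say

$$\theta_i = (a_0 + a_1 i + a_2 i^2)\sigma,$$

equations (4.2) reduce to

$$\begin{aligned}
\mu_1'(\delta_2^2/6) &= \mu_2 (1 + 2a_2^2/3), \\
\mu_2(\delta_2^2/6) &= (3n-6)^{-2} \mu_2^2 \big[ (9\beta_2 + 8)n - (23\beta_2 + 19) - 16a_2\mu_3/\mu_2^{3/2} + 16a_2^2 \big].
\end{aligned} \tag{4.5}$$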

Kamat [2, p. 101] gives the bias in $\delta_2^2$ for normal population and quadratic
trend.

When $\Delta^2\theta_i$ is smaller (in mean square) than $\Delta\theta_i$, the increase in variance will
be smaller for $\delta_2^2/6$ than for $\delta^2/2$; accordingly, $\delta_2^2$ may be preferable to $\delta^2$ when
first differences are insufficient to eliminate trend. Kamat [2] gives an example
in which $\delta^2$ is heavily biased whereas $\delta_2^2$ eliminates the bias.
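The relative biases of the three estimators under a given trend follow from the expected values alone: each estimator of $\mu_2$ is inflated by a sum of squared differences (or deviations) of the means $\theta_i$. The Python sketch below computes these inflation terms for any sequence of means; the function name and dictionary keys are illustrative, not from the paper.

```python
def relative_bias(theta, mu2=1.0):
    """Bias, as a fraction of mu_2, of delta_2^2/6, delta^2/2 and s^2 under trend theta."""
    n = len(theta)
    d1 = [theta[i + 1] - theta[i] for i in range(n - 1)]   # first differences of the means
    d2 = [d1[i + 1] - d1[i] for i in range(n - 2)]         # second differences of the means
    mean = sum(theta) / n
    return {
        "delta2": sum(d * d for d in d2) / (6.0 * mu2 * (n - 2)),
        "delta1": sum(d * d for d in d1) / (2.0 * mu2 * (n - 1)),
        "s2": sum((t - mean) ** 2 for t in theta) / (mu2 * (n - 1)),
    }
```

For a quadratic trend with coefficient $a_2$ on $i^2$ and $\sigma = 1$, the "delta2" entry equals $2a_2^2/3$ whatever the linear and constant terms are, while the other two entries grow with the length of the series.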


5. EFFICIENCY OF $\delta_3^2$ AND $\delta_4^2$ RELATIVE TO $s^2$


We define the mean square successive third difference

$$\delta_3^2 = (n-3)^{-1} \sum_{i=1}^{n-3} (x_{i+3} - 3x_{i+2} + 3x_{i+1} - x_i)^2$$

and the mean square successive fourth difference

$$\delta_4^2 = (n-4)^{-1} \sum_{i=1}^{n-4} (x_{i+4} - 4x_{i+3} + 6x_{i+2} - 4x_{i+1} + x_i)^2$$

The methods of Section 2 yield

$$\begin{aligned}
(n-3)\,\mu_1'(\delta_3^2) &= 20(n-3)\mu_2, \\
(n-3)^2\mu_2(\delta_3^2) &= \mu_2^2 [(400\beta_2 + 648)n - (1476\beta_2 + 2316)], \\
(n-4)\,\mu_1'(\delta_4^2) &= 70(n-4)\mu_2, \\
(n-4)^2\mu_2(\delta_4^2) &= \mu_2^2 [(4900\beta_2 + 11040)n - (23480\beta_2 + 52120)].
\end{aligned}$$
The efficiency of $\delta_3^2$ and $\delta_4^2$, relative to $s^2$, is shown in Table 806; its behaviour
as a function of n and $\beta_2$ is similar to that of the efficiency of $\delta_2^2$. We shall not
discuss in detail the properties of these statistics, since second differences
usually suffice to eliminate trend, so that the use of the inefficient statistics
based on third and fourth differences can be avoided.
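The efficiencies of the third and fourth difference statistics follow from the variance expressions given by the methods of Section 2, divided into the variance of $s^2$. A Python sketch is given below; the function names are hypothetical, variances are in units of $\mu_2^2$, and the statistics are scaled by 20 and 70 so that they are unbiased for $\mu_2$.

```python
def var_s2(n, beta2):
    """Variance of the sample variance s^2, in units of mu_2^2."""
    return (beta2 - 1.0) / n + 2.0 / (n * (n - 1.0))

def var_delta3_sq_over_20(n, beta2):
    """Variance of delta_3^2 / 20 for constant mean, in units of mu_2^2."""
    return ((400.0 * beta2 + 648.0) * n - (1476.0 * beta2 + 2316.0)) / (400.0 * (n - 3.0) ** 2)

def var_delta4_sq_over_70(n, beta2):
    """Variance of delta_4^2 / 70 for constant mean, in units of mu_2^2."""
    return ((4900.0 * beta2 + 11040.0) * n - (23480.0 * beta2 + 52120.0)) / (4900.0 * (n - 4.0) ** 2)

def percent_efficiencies(n, beta2):
    """Percent efficiency of the third and fourth difference estimators relative to s^2."""
    base = var_s2(n, beta2)
    return (100.0 * base / var_delta3_sq_over_20(n, beta2),
            100.0 * base / var_delta4_sq_over_70(n, beta2))
```

For n = 10 and $\beta_2 = 2$ this gives about 26 and 20 percent, and for n = 10 and $\beta_2 = 10$ about 61 and 52 percent.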


TABLE 806. PERCENT EFFICIENCY OF $\delta_3^2$ RELATIVE TO $s^2$,
AND IN PARENTHESES OF $\delta_4^2$ RELATIVE TO $s^2$

                             $\beta_2$
    n       2         3         4         6         10

   10    26 (20)   37 (29)   44 (35)   53 (44)   61 (52)
   20    27 (22)   40 (34)   49 (42)   59 (52)   69 (62)
   30    27 (22)   41 (35)   50 (44)   61 (55)   72 (66)
   40    27 (23)   42 (36)   51 (45)   62 (56)   73 (68)
   50    27 (23)   42 (36)   52 (46)   63 (57)   74 (69)
  100    28 (23)   43 (37)   52 (47)   64 (59)   76 (71)
    ∞    28 (24)   43 (38)   53 (48)   66 (61)   77 (73)
NOTES ABOUT AUTHORS


ARTHUR COHEN, 26, 2600 East 21st Street, Brooklyn 35, is a candidate for the 
Ph.D. in mathematical statistics at Columbia University. He received his B.A. in mathe- 
matics at Brooklyn College (1955) and his M.A. in mathematical statistics from Columbia 
(1958). From 1957 to 1959, he served as statistician with the Epidemic Intelligence Service, 
United States Public Health Service, Atlanta, Georgia. This is his first publication. His 
major interests are mathematical statistics and biostatistics and this paper is an extension 
of his M.A. thesis at Columbia University 

SAMUEL MAURICE COHN, 44, has been Chief, Fiscal Analysis, Office of Budget 
Review, Bureau of the Budget, since 1955. He received his B.A. in mathematics from 
the University of Pennsylvania (1936) and has done graduate work in economics at the 
same university. Prior to his present position, he served as Fiscal Analyst and Economist, 
Bureau of the Budget (1947-55); Economic Analyst, Office of War Mobilization and Re- 
conversion (1946-47); a member of the United States Army Air Force and the Finance 
Division of U. S. forces, European Theater (1943-45); Economic Statistician, Redistribution
Division, War Production Board (1942); Research Analyst, Philadelphia Housing
Authority (1939-40); Research Assistant, Industrial Research Department, Wharton 
School, University of Pennsylvania (1938-39 and 1941-42); and Statistical Clerk, Na- 
tional Research Project (1937). This is his first publication in JASA, but he assisted 
Gladys L. Palmer in writing two University of Pennsylvania Press monographs and is the 
author of numerous Bureau of the Budget publications. He is especially interested in 
public finance and economic forecasting. About this article, he says: 

“I have been concerned, over the years, by the number of economic forecasters 
who have used official Federal budget estimates as if they were forecasts. Even in 
hearings before Congressional Committees—including the Joint Economic Committee 
—executive branch witnesses have been taken to task for poor ‘budget forecasts.’ 
The rise of surveys of consumer intentions and of business investment plans has 
contributed to the fallacy that the Federal budget was the same kind of ‘intentions 
survey’ for the Federal Government. Of course, it is not. Although it might be con- 
strued as an intentions survey for the executive branch, this is a far cry—in our 
political system of checks and balances—from an intentions survey for the Govern- 
ment as a whole. 

“When Arthur Burns was Chairman of the Council of Economic Advisers, he 
and I discussed at some length a few of the issues involved. He recognized many of 
the problems, and was concerned that blind use of the official estimates might result 
in poor forecasts and lead to improper policy evaluations. He encouraged me to 
think about the general problem and of ways to improve specific estimating techniques
as well. 

“One possible solution to the general problem, it seemed to me, was to get the 
word to economic forecasters—to have them learn to use the Federal budget as a 
kind of benchmark, rather than a ‘bible,’ and to do their own forecasting of likely 
departures from it. I therefore seized the opportunity when George Garvy asked me 
to give a paper at the 1959 annual meeting of the Association. He wanted me to 
help the members understand why the 1959 budget estimates were turning out to 
be so wide of the mark. I broadened the subject somewhat in the hope that I could 
help statisticians and forecasters understand the nature of the budget animal and 
thus improve forecasts generally. Eventually, I am sure, we will have an ‘intentions
survey’ for Government expenditures. Perhaps this article is a step on the way.” 


BERNARD GEORGE GREENBERG, 40, has been Chairman and Professor of 
Biostatistics in the Department of Biostatistics, School of Public Health, University of 
North Carolina at Chapel Hill since 1949. He received his B.S. in mathematics from City 
College of New York (1939) and his Ph.D. in experimental statistics from North Carolina 
State College (1949). Prior to his present position he served with the New York State 
Department of Health (1940-46) except for the years when he was on military leave 
(1941-46), and with the Bureau of the Census (1940). He has published articles in various 
statistical journals as well as in periodicals for applied fields relating to medicine, public 
health, and medical education. He is the author of two previous JASA publications: 
(with Sarhan) “Tables for best linear estimates by order statistics of the parameters of 
single exponential distributions from the singly and doubly censored samples” (Mar.
1957) and (with Wright and Sheps) “A technique for analyzing some factors affecting the
incidence of syphilis” (Sept. 1950). His primary fields of interest are research in order 
statistics, experimental design, and related subjects, and the training of biostatisticians. 
About the present article he and the co-author say: “This article was a tangential out- 
growth of the authors’ primary interest in order statistics. In working with many struc- 
tured matrices in that field, the authors felt that more research would bring out the 
pattern of the relationships and simplify the process of inversion. An earlier paper of 
Sarhan’s with S. N. Roy was found to be a special case of several results in this one.
The present effort represents the tenth article in statistical journals and monographs 
jointly written by the two authors. It constituted the basis of an invited paper at the 
International Statistical Institute Meetings in Brussels, 1958.” 

NORMAN THEODORE GRIDGEMAN, 47, has been with the Division of Applied 
Biology (Biometrics Section) of the National Research Council in Ottawa, Canada, since 
1952. He received his B.Sc. in special chemistry from the University of London (1938)
and is a Fellow of the Royal Institute of Chemistry (1951). During World War II he 
worked in industrial research with Unilever Limited in England. His publications in the 
fields of food chemistry, bioassay, horticultural experimentation, taste-testing, and bi- 
ometry have appeared in such journals as The Analyst, The Biochemical Journal, Bio- 
metrics, and Applied Statistics. This is his first publication in JASA. His interests have 
ranged from food chemistry to the design and interpretation of research experiments and 
carrying out investigations in the fields of sensory perception (especially taste). About this 
article he says, “It struck me that R. A. Fisher’s celebrated chapter still provides food for 
thought, and that no one seems to have tried to collate the views of Fisher himself, Ney- 
man, and Hogben (through his colleague Wrighton) on the rationale of the tea-tasting 
problem. What a tribute to Fisher, by the way, is the fact that this chapter, so full of 
technical and philosophical meat, can yet be reproduced in toto in a popular anthology 
(Newman’s World of Mathematics)!” 

HYMAN BENJAMIN KAITZ, 43, has been Statistical Consultant, Division of
Manpower and Employment Statistics, U. S. Bureau of Labor Statistics, since 1958. He 
received his B.A. in statistics (1942) and his M.A. in mathematical statistics (1950), 
both from George Washington University. He has also studied at Lowell Institute, Stanford
University, and the U. S. Department of Agriculture Graduate School. Prior to his
present position, he served as Assistant Chief, Statistics Division, World Bank (1958); 
Chief, Branch of Actuarial Methods and Estimates, and then Acting Chief, Actuarial 
Division (1954-58); Statistician, National Income Division, Department of Commerce 
(1946-53) and with the U. S. Army Air Force doing statistical work in psychological
research during World War II. He has had no articles in JASA, but he contributed to 
Volumes 13 and 23 of the Conference on Income and Wealth of the National Bureau of 
Economic Research, and to America’s Needs and Resources of the 20th Century Fund. 
Other articles have appeared in the Review of Economics and Statistics, Psychometrika, 
Journal of Educational Psychology, and Proceedings of the Business and Economics Section 
of the American Statistical Association. His major interest is application of mathematical 
and statistical methods in the social sciences. 

MITCHELL O. LOCKS, 37, is Visiting Associate Professor of Business Administra- 
tion, Graduate School of Business Administration at the University of California, Los 
Angeles. He is on leave from Remington Rand Univac of St. Paul, Minnesota, as Senior 
Staff Statistician. He received his A.B. in Economics from Central YMCA College 
(1942) and his M.A. (1949) and Ph.D. (1953) both in Economics and both from the Uni- 
versity of Chicago. He has been statistician and mathematician with Remington Rand 
since 1955; Assistant Professor of Business Statistics, University of Oklahoma (1952-55) ; 
District Economist, Office of Price Stabilization (1951-52); and Lecturer in Statistics, 
University of Minnesota, Duluth Branch (1949-51). 

He is the author of papers on “Influence of Unions on Wages,” Review of Economics
and Statistics, 1954; “Two Nonparametric Tests,” Proceedings of the Oklahoma
Academy of Science, 1954; “Computer-Centered Inventory Management,” Proceedings
of the Business and Economics Statistics Section of the A.S.A. Annual Meetings, 1958; and
two papers on queueing theory, as well as a review of Pierson’s Community Wage Patterns
which appeared in the JASA (Mar. 1954).

His major interests are data processing and statistical applications of automatic 
digital computers. About this article he says: “It started from my interest in the possibili- 
ties of using automatic programming by statisticians and other users not interested in 
becoming professional programmers in order to have problems solved on the machines.” 

PAUL BENJAMIN MORANDA, 39, has been Staff Scientist with the Range 
Systems Operation of Aeronutronic at Newport Beach, California, since September 1959. 
He received his B.A. in chemistry from Fresno State College (1942), and his M.A. (1949) 
and Ph.D. (1953) both in mathematics from Ohio State University. Prior to his present
position he served as Supervisor of Numerical Analysis at Autonetics (1953-59); and 
as Research Associate in the Antenna Laboratory (1951-52) and the Cryogenics Labora- 
tory (1947-48), both at Ohio State University. This is his first JASA publication. He is 
the author (with Mann) of “On the estimation of parameters in an Ornstein-Uhlenbeck 
process” which appeared in Sankhya in 1953. His main interests are in mathematical 
statistics and numerical analysis. 

JONNAGADDA NALINI KANTH RAO, 22, has been a Graduate Assistant in 
Statistical Laboratory at Iowa State University since 1958. He received his M.A. in sta- 
tistics from Bombay University, India (1956). He was a research scholar in the Forest 
Research Institute, Dehra Dun, India (1956-58) prior to his present position. This is 
his first appearance in JASA, but his previous publications include “A characterization 
of the normal distribution,” Annals of Mathematical Statistics, 1958 as well as articles in 
the Journal of the Indian Society of Agricultural Statistics and Indian Forester. His major
interests lie in survey sampling, mathematical statistics and economics. 

AHMED EBADA SARHAN, 38, is currently Visiting Professor of Biostatistics at 
the School of Public Health, University of North Carolina at Chapel Hill. His permanent
address and affiliation are: Assistant Professor of Biostatistics, High Institute of Public 
Health, Alexandria, Egypt. He received his B.Sc. in mathematics (1943) and his Dip.Stat. 
in statistics (1950) both from Cairo University, Egypt; his M.Sc. in mathematical sta- 
tistics from Liverpool University, England (1952); his M.S. Hyg. in Public Health from 
Harvard University (1953); and his Ph.D. in biostatistics from the University of North 
Carolina (1955). He has held various positions in Cairo University. He has published 
articles in the Annals of Mathematical Statistics, Biometrika, and other scholarly peri- 
odicals, as well as a textbook in Arabic on sampling methods. He is the author of two previ- 
ous JASA Publications: (with Greenberg) “Tables for best linear estimates by order 
statistics of the parameters of single exponential distributions from singly and doubly 
censored samples” (Mar. 1957) and “Estimation of the parameters of a skewed distribution 
by linear systematic statistics” (Mar. 1955). His primary interest is in research in order 
statistics and public health statistics. 

ROBERT GEORGE DOUGLAS STEEL, 42, has been Associate Professor of Bio- 
logical Statistics, Biometrics Unit, Department of Plant Breeding of Cornell University 
since 1952. He spent a sabbatic leave at the Mathematics Research Center, U. S. Army,
at the University of Wisconsin (1958-59) where he wrote this article. He received his 
B.A. (c.l.) in chemistry (1939) and his B.Sc. with honors in mathematics (1940), both 
from Mount Allison University, Canada; his M.A. in mathematics (1941) from Acadia 
University, Canada, and his Ph.D. in statistics from Iowa State College (1949). Prior to 
his present position he served as Assistant Professor of Mathematics and Agriculture 
Experiment Station Statistician at the University of Wisconsin (1949-52); Instructor in 
Mathematics, Queen’s University, Canada (1945-46); and Navigation instructor with 
the Royal Canadian Air Force (1942-45). This is his first publication in JASA, but he 
is the author of articles in Annals of Mathematical Statistics and Biometrics, and (with
J. H. Torrie) of Principles and Procedures of Statistics which is to be published by McGraw-
Hill Book Co. in March, 1960. His major interests lie in teaching and in the application
of statistics in the biological sciences. About this article he says: “While writing a chapter 
on non-parametric methods for the book mentioned above, I felt that the user of the 
analysis of variance had been neglected in the matter of multiple comparisons that were 
also non-parametric.” 
CORRIGENDA


Readers and authors are invited to submit corrections to papers published 
in any previous issue. These will be published each year in the December issue. 


Adelman, Irma G., A STOCHASTIC ANALYSIS OF THE SIZE DISTRIBUTION OF
FIRMS, Vol. 53, No. 284 (December 1958), 893-904.


The author would like to add the following references which were omitted 
when the article was originally published: 


REFERENCES 


[1] Rosenblatt, D., “The distribution of income and consumer behavior representations,”
Econometrics, 19 (1951), 334-5.

[2] Simon, H. A. and Bonini, C. P., “The size distribution of business firms,” American
Economic Review, 48 (1958), 607-17.


Anscombe, F. J., RECTIFYING INSPECTION OF A CONTINUOUS OUTPUT, Vol. 53,
No. 283 (September 1958), 702-19. 


The author writes, “The last sentence but two in the Abstract (p. 702) 
should end with the phrase, ‘the output being held in bond for a while after it 
has passed the inspection point.’ 

“The second displayed equation in §8 (p. 712) should read 


t = (1/k) log_e n.”


Baldwin, Roger R., Cantey, Wilbert E., Maisel, Herbert, and McDermott,
James P., THE OPTIMUM STRATEGY IN BLACKJACK, Vol. 51, No. 275
(September 1956), 429-39.


Edward O. Thorp (University of California) suggests the following correc-
tions with which the authors of the article concur:


page 432, line 9: In the equation beginning “Ei, = ...,” change 
Fem...” 

page 433, line 4: Change “...the last term...” to “...the last two 
terms...” 

page 433, line 5: Change “...J and J.” to “...J and T.”

page 433, line 19: The top half of a “Σ,” or “summation sign,” is missing.

page 435, line 3: Change “...M*(D)=19;...” to “...M+*(D)=18;...”

page 431, line 8: Change “...not exceeding 21...” to “...not exceed-
ing 20...”

page 436, line 23: Change “...2...” to “...8...”

page 426, line 2: Change “...2y(X...” to “...2yGX...”

page 438, line 4: Change “Values of D Where all values” to 

“Values of D Where all values” 


Biometrika writes that their tables listed under Publications Received Vol.
54, No. 285 (March 1959), 334, are obtainable from the Biometrika Office,
University College London, Gower Street, London W. C. 1, England, rather
than from Cambridge University Press as the listing seems to indicate.
Cowden, Dudley J., A PROCEDURE FOR COMPUTING REGRESSION COEFFI-
CIENTS, Vol. 53, No. 281 (March 1958), 144-50.


The author writes: “R. J. Foote has kindly pointed out that formula (1) 
in my article is listed as formula (14.12) on page 269 of Yule and Kendall,
An Introduction to the Theory of Statistics, 11th Edition, London: Charles 
Griffin and Company, 1937. Mr. Foote has used the method extensively, and 
has stated this formula in slightly different form in a 1952 mimeographed 
publication of the United States Department of Agriculture Bureau of Agri- 
cultural Economics, ‘Calculation of Partial and Multiple Regression Correla- 
tion Coefficients—3 to 5 Variables.’ ” 


INDEX TO VOLUME 50, No. 272 (December 1955), 1429-30. 


Both articles by Paul R. Rider bear the wrong page numbers. THE DIS-
TRIBUTION OF THE PRODUCT OF MAXIMUM VALUES IN SAMPLES FROM A REC-
TANGULAR DISTRIBUTION actually appears on page 1142 rather than 1064 as
it is shown; and TRUNCATED BINOMIAL AND NEGATIVE BINOMIAL DISTRIBU-
TIONS actually appears on page 877 rather than 1142 as it is shown.


Landis, Benson Y., A GUIDE TO THE LITERATURE ON STATISTICS OF RELIGIOUS
AFFILIATIONS WITH REFERENCES TO RELATED SOCIAL STUDIES, Vol. 54,
No. 286 (June 1959), 335-57.


The title page of this article should have carried the following footnote: 

“This paper was prepared to supplement statistics on religious affiliation 
in the forthcoming new edition of Historical Statistics of the United States com- 
piled under the joint sponsorship of the Bureau of the Census and the Social 
Science Research Council.” 


MacKinnon, William J., COMPACT TABLE OF TWELVE PROBABILITY LEVELS
OF THE SYMMETRIC BINOMIAL CUMULATIVE DISTRIBUTION FOR SAMPLE 
Sizes To 1,000, Vol. 54, No. 285 (March 1959), 164-72. 


The author writes: 

“Morton Raff (U. S. Bureau of Labor Statistics) has pointed out the need
for the following changes on page 168. The right-hand member of equation
(5) should read (.014/√n). (This new meaning for δ may correctly be read
into footnote 5.) The expression immediately following the first comma in the
second line of footnote 2, should read (.112/√n [10, p. 296]). The footnote
should end with a period following this substituted material, as the original 
second predication is incorrect. 

“Mr. Raff supplies the following useful information related to the approxi- 
mations on page 168. The maximum error of the two-tail probability associated 
with 2, is less than .01 for any n and is less than .001 for n≥3; the maximum
error of the two-tail probability associated with z is less than .01 for n>6 and 
is less than .001 for n> 54. 

“Finally, a circumflex should appear over the first letter P in the last line 
of the fourth paragraph to begin on page 171.” 
Madansky, Albert, THE FITTING OF STRAIGHT LINES WHEN BOTH VARIABLES
ARE SUBJECT TO ERROR, Vol. 54, No. 285 (March 1959), 173-205.


The author writes: “The following corrections should be made to my article: 


“p. 183, line 12 change z;-,, to 
p. 187, last line, change denominator of fraction to 


p. 185, line 9 from bottom, change with u to with u or v
p. 191, line 22, change in x to in x and y,
line 26, change u to u or v.


“I am indebted to E. L. Kaplan for calling attention to the last three errata.” 


Neiswanger, W. A., and Yancey, T. A., PARAMETER ESTIMATES AND AUTON-
OMOUS GROWTH, Vol. 54, No. 286 (June 1959), 389-402.


The authors write: “We are indebted to Professor T. W. Anderson for 
pointing out our failure to comment that the LISE estimates in Tables 396, 
398 and 399 must be identical for all estimates except those for the coefficients 
©. 25 in each equation. Since the trends inserted in the exogenous variables 
and shocks are a multiple of z;, the LISE method obtains exactly the same
Was and the same R,, matrices, in Chernoff-Divinsky notation, for all three
cases. We might add that the standard errors will also be the same except for
zs, and in addition Theil-Basmann estimates will also be identical in these
three cases.”


Ramachandran, K. V., ON THE STUDENTIZED SMALLEST CHI-SQUARE, Vol. 53,
No. 284 (December 1958), 868-72. 


The author writes: “I wish to acknowledge the priority of C. R. Rao for a 
more extensive version of one of the tables (Table 744) in my article which was 
published by him in Advanced Statistical Methods in Biometric Research (p. 
233).” 
Probability and Statistics for Business Decisions: An Introduction to Managerial Eco-
nomics under Uncertainty. Robert Schlaifer. New York: McGraw-Hill Book Company, 
Inc., 1959. Pp. xii, 732. $11.50. 


F. J. ANSCOMBE, Princeton University


AT THE thought of yet another elementary book on statistics (requiring no calculus),
this time for businessmen, over 700 pages long, suitable as text for a one-year 
course to economics students—one’s heart sinks. What a surprise when one summons 
up courage to open it! Comparison is suggested with another remarkable elementary 
text, Statistics, A New Approach by W. A. Wallis and H. V. Roberts. These books 
resemble each other in being meticulously written and highly original in presenta-
tion, so as to make much previous publication obsolete; both can be read with interest
by the professional statistician as well as by the student. They differ primarily in the
familiarity and orthodoxy of their content. This is an elementary exposition of the
science of decision making under uncertainty, by one who has accepted L. J. Savage’s
Foundations of Statistics and related literature and is not hindered by a compulsion 
to reiterate what has been said in previous elementary texts. 

The book has been developed out of a course given to selected groups of students 
at the Harvard Graduate School of Business Administration over a period of about 
five years. Although the mathematical level is very low, it is a book for adults, and is
not too likely to prove useful for undergraduates. The integrity of the style is quite 
astonishing. The great length is due to careful and detailed explanations of calcula- 
tions and to subtle discussions of the proper framing of problems. The line of argu- 
ment can be followed easily, there are no pointless digressions or clutter, everything 
serves the main purpose. 

The first half of the book is made up of the three introductory chapters, which ex- 
plain the notions of probability, expectation and utility; then a group of five chapters 
forming what is called Part One, bearing the title “The use of probabilities based 
directly on experience”; then twelve chapters forming Part Two, “Simple random 
processes and derived probabilities.” The purpose of Part One is to illustrate how 
probabilities can be used to find the expected costs or losses resulting from alternative 
courses of action, and to explain these concepts. Part Two introduces some probabil-
ity theory needed for the computation of probabilities; topics covered include joint
and conditional probabilities, binomial and Poisson distributions and processes, the 
normal and exponential distributions. The problems presented to illustrate the theo- 
retical ideas nearly all relate to inventory control, the highpoint being a study of the 
“min-max” policy of placing an order for a fixed replacement quantity as soon as the 
stock on hand falls to a pre-determined level (chap. 15). There is also a perceptive 
chapter outlining without proofs some of the results of queueing theory (chap. 19). 


The second half of the book is concerned with statistics proper, i.e. with the inter-
pretation of samples and the design of sampling. It consists mainly of Part Three,
“The use of information obtained by sampling,” in twelve chapters, and Part Four,
“The value of additional information,” in six chapters. The problems studied are 
mostly two-action problems; particularly noteworthy is a lucid treatment of accept- 
ance sampling inspection (chap. 36). Theoretical topics covered include the use of 
Bayes’ theorem with samples from finite populations, from normal populations and 
from nearly normal (incompletely specified) populations, the assessment of the value 
of sample information, optimum fixed sample size, the idea of a sequential decision 
procedure. In addition, there is Part Five, having four chapters, on objectivist statis-
tics, a short appendix on continuous prior distributions, and various tables and charts.

It will be noticed that the subject matter is different from what is usually called 
Business Statistics. The reader is, however, continually made aware of the relevance 
of the ideas presented to problems in production and marketing, and the title of the 
book is perfectly fair. The illustrative problems, discussed in the text or given as exer-
cises at the ends of chapters, are a most attractive feature. Anyone with ten minutes
to spare who wishes to learn at first hand what the book is like cannot do better than 
turn to p. 456 and read the case history of Mar-Pruf Finishes, Inc., a small firm oper- 
ating in a market dominated by the American Paint and Lacquer Company. Wonder- 
ing whether they dared put a new lacquer on the market, they conducted a pilot 
survey to estimate what sales might be expected . . . . And on p. 191 one may read of 
the predicament of the Warner Aircraft Engine Company, which in June, 1955, re- 
ceived an order for 10 spare ring gears from New England Airlines. Only the ten items 
were needed, but a high wastage was expected in production, because an inadequately 
controlled heat treatment followed the various expensive machining operations... . 
On p. 504 we find the President of the Grand Western Railroad having grave doubts 
about the effect on revenue of the Family-Fares Plan, and paying out $6009 to an 
agency to have a small sample of purchasers of Family-Fare tickets interviewed .... 
Each story is told with plentiful detail, and invites a splendid argument. 

A curious feature of the book is that it is entirely without references. The only 
authors to whom any theoretical development is even indirectly attributed are Pas- 
cal, Bernoulli, Bayes and Poisson (and the attribution of the negative binomial dis- 
tribution to Pascal is debatable, to say the least!). It seems to me to be bad in prin- 
ciple, even in an elementary text, to suppress history—especially when so much of the 
history is very recent. A mathematically educated reader would appreciate references 
for various questions that are briefly mentioned in the text and dismissed as “beyond 
the scope of this course.” The omission of references is most serious, however, in Part 
Five. Here the author’s purpose is destructive, to make merry over the “classical” 
notions of tests of significance and confidence intervals. It’s good clean fun in a way, 
but the author might have done better to cut it out. His account of significance tests
does not adequately reflect the troubled state of the literature of “classical” statistics, 
and is likely to provoke those who know and mislead those who do not. For the 
theorist interested in controversy, this sort of thing has been done well already by 
Savage. For the inexperienced, all controversy is confusing. And in fact significance 
tests cannot be discarded so easily. If a two-action problem is sufficiently well defined 
to permit of a Bayesian treatment, thinking of it in terms of a significance test is in- 
deed absurd. But what if it isn’t? What if (in Neyman-Pearson terminology) we 
cannot specify the whole of Ω? One may hope that Part Five will be taken with a
grain of salt, enjoyed for its wit, and not brooded over too seriously. 

To sum up: Professor Schlaifer is to be congratulated on an extraordinary achieve- 
ment. 


Statistics: An Introduction. Donald A. S. Fraser. New York: John Wiley and Sons, Inc., 
1958. Pp. ix, 398. $6.75. 


H. A. a Virginia Polytechnic Institute 


THIS is an important textbook and will be welcomed particularly by teachers and
students of elementary and intermediate courses in theoretical statistics. It is 
written at the level of the well-known book by Mood. Chapter headings read as fol- 
lows: 1. Introduction, 2. Simple Probability Distributions, 3. Discrete Probability 
Distributions, 4. Continuous Probability Distributions, 5. Characteristics of Dis-
tributions, 6. Sampling From Probability Distributions, 7. Sampling From Finite
Populations, 8. Some Probability Distributions, 9. Estimation, 10. Hypothesis Test- 
ing, 11. Confidence Methods, 12. Regression Analysis, 13. Factorial Designs, 14. Some 
Techniques of Experimental Design, 15. Sampling Inspection and Sequential Analy- 
sis, 16. Nonparametric Methods. 

From this listing it will be seen that the lay-out of the book follows what is by 
now a well-established pattern. Within this orthodox framework, however, many 
special features have been introduced by Fraser. 

After a discussion of sample spaces the author explains the meaning of probability 
with the help of the results of 12,800 tosses he made with a crude die. This frequency 
approach is supplemented by symmetry arguments which in the case of a perfect 
die and many other situations permit the (non-circular) use of equiprobable sample
points in evaluating probabilities. The basic operations with probabilities are then 
dealt with, including a discussion of independence and conditional probability. Stand- 
ard univariate and multivariate discrete distributions are introduced in Chap. 3. 
Continuous distributions are then treated, the normal distribution both univariate 
and bivariate being studied in detail. In Chap. 5 the author defines expected values 
and deals with moment generating functions, univariate and multivariate. Sampling 
and the most important limit theorems are then fairly briefly introduced. Chap. 7 
is an unusual one: the first introduction to analysis of variance is made in the course 
of a treatment of sampling from a finite population. The reading of much of this 
chapter may be deferred until a later more detailed account of the design of experi- 
ments. In Chap. 8 the elements of matrix algebra are introduced and pivotal methods 
employed for the solution of linear equations and the construction of orthogonal 
matrices. Next, variate transformations are treated and the standard distributions 
derived. Orthogonal transformations are used to obtain the F-distribution of the 
variance ratio in the analysis of variance. Chap. 9 on estimation includes sections on 
sufficiency and completeness. Linear models are discussed at some length, and in 
Chap. 10 on hypothesis testing the likelihood-ratio method is shown to lead to the 
usual F-tests for these models. It is good to see a brief treatment of multiple con- 
fidence methods included among the more usual material on confidence intervals. 
Simple and multiple regression are handled in Chap. 12. Factorial designs are then 
introduced. In a careful discussion of models the author draws useful distinctions be-
tween discrete and continuous as well as primary and secondary factors. Chap. 14
contains concise accounts of randomization theory, analysis of covariance, and frac-
tional replication. The last two chapters are short and doubtless designed primarily
to whet the appetite. 

From this summary it will be evident that a considerable coverage has been 
achieved. Certain more advanced or difficult sections of the book are so marked and 
may be omitted at a first reading. The writing is very clear throughout, and the book 
is thoroughly sound, as one would expect from the author. Computational techniques 
are not shunned and many examples are worked in detail. The only weakness in this 
excellent book lies, in the reviewer’s opinion, in the problems at the end of chapters. 
These are mostly simple extensions and applications of the theory and consequently 
rather routine, at least for the instructor. Also there is a dearth of really challenging
exercises. 

One oversight was noted by the reviewer. In the theorem on p. 51 it is stated that
if the probability function f(x_1, x_2) factors: f(x_1, x_2) = r(x_1) s(x_2) then x_1 and x_2 are
independent. It should be added that the ranges of x_1, x_2 must not be interdependent.
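
The caveat about interdependent ranges can be checked on a small case. The sketch below is not taken from Fraser's book; the triangular support and the function names are assumptions chosen for illustration. The joint probability is constant on its support, so it factors trivially there, yet the two variables are dependent because the range of one restricts the range of the other.

```python
from fractions import Fraction
from itertools import product

# Hypothetical example: uniform probability on the triangle 1 <= x1 <= x2 <= 3.
VALUES = range(1, 4)
SUPPORT = [(x1, x2) for x1, x2 in product(VALUES, repeat=2) if x1 <= x2]
P = Fraction(1, len(SUPPORT))  # constant on the support, so it "factors" there


def joint(x1, x2):
    """Joint probability function; zero outside the triangular support."""
    return P if (x1, x2) in SUPPORT else Fraction(0)


def marginal_x1(x1):
    return sum(joint(x1, x2) for x2 in VALUES)


def marginal_x2(x2):
    return sum(joint(x1, x2) for x1 in VALUES)


def is_independent():
    """True only if the joint equals the product of marginals everywhere."""
    return all(joint(a, b) == marginal_x1(a) * marginal_x2(b)
               for a, b in product(VALUES, repeat=2))


# The point (3, 1) lies outside the support, but both marginals are positive.
print(joint(3, 1), marginal_x1(3) * marginal_x2(1))
print(is_independent())
```

Exact fractions are used so that the comparison between the joint probability and the product of marginals is not blurred by rounding.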

It is safe to predict that this book will come to be widely used for basic theory 
courses in statistics. 
Principles of Statistical Techniques. P. G. Moore. New York: Cambridge University
Press, 1958. Pp. viii, 239. $3.75. 


CLYDE Y. KRAMER, Virginia Polytechnic Institute


THE author states “that this book is an attempt to put across the main principles
of statistical methods to students who are fundamentally interested in the prac-
tical applications of the subject and are not so much concerned with the philosophical
bases of the concepts used.” To do this the author deliberately keeps the amount of 
mathematics down to a minimum. In fact the statistics is kept down to a minimum 
in that the majority of the book deals with descriptive statistics and data reduction. 
This phase, narrow but important in some fields, is extremely well covered and
presented. 

Words are used freely to both motivate the reader and to present him with ex- 
amples and problems on such topics as: (i) data collection, (ii) data tabulation, (iii) 
pictorial representation, (iv) frequency distributions, (v) averages, (vi) measures of 
dispersion, and (vii) time series. A very limited amount of time is spent on probability, 
significance tests, and correlation. 

One disadvantage to this book in addition to a limited coverage of topics is that the 
English system of money and English words are used which may not be familiar to 
American students. If statistics were taught in American high schools or if a student 
mainly wanted to be familiar with methods of reducing and presenting data this would 
be a good book to use. 


Basic Statistical Methods. N. M. Downie and R. W. Heath. New York: Harper, 1959. Pp.
xii, 289. $4.50. 


OLIVE JEAN DUNN, University of California, Los Angeles


THIS book was written for students of education, psychology, and sociology. It
begins, after an introductory chapter, with a chapter on review of arithmetic,
which should certainly prove very helpful to many students. Chapters 3 to 9 are on
descriptive statistics, and include a chapter on the normal curve, one on correlation, 
and one on linear regression. Chapter 10 discusses probability and the binomial dis- 
tribution. In Chapter 11, sampling is discussed and confidence intervals for the mean 
are introduced. Chapters 12 through 15 give the usual tests, ending with the single 
classification analysis of variance. Chapter 16 defines a number of statistics such as 
rank order correlation, the partial correlation coefficient, etc. Chapter 17 is on re- 
liability, validity, and item analysis, and Chapter 18 gives some of the more usual 
distribution free tests.

In the authors’ words, the book was written “in an attempt to fill the need for an
introductory book which is as short as possible, as clear as possible, involves as little 
mathematics as possible, and stresses computation, application, and interpretation.” 
An overall evaluation of the book is that it succeeds in being short, in involving little 
mathematics, and in stressing computation and application, but is somewhat less 
successful in being clear and in stressing interpretation. 

The student should find the book helpful in showing him how to perform the
various statistical procedures. The book seems to contain few errors as long as one 
considers only what to do and how to do it. Beyond this point, however, the re- 
viewer finds several weaknesses which keep this from being an excellent introductory 
text. They are: 

(1) The authors do not succeed in keeping distinct throughout the book the con-
cept of population and parameter on the one hand and sample and statistic
on the other. They mention the distinction now and then, but not enough,
since it seems to be difficult for most students. They usually do not use dif-
ferent notations for parameter and statistic; this is perhaps best illustrated by
a sentence on page 121: “In the first place the sampling distribution of p is
not the same for all values of p.”

(2) There are ambiguous, misleading and sometimes incorrect statements scattered
through the book. As an example, the interpretation of a confidence interval is,
on page 120: “We can now state that the chances are 95 in 100 for this sample
that the population mean falls within the band 67.45-72.55.” Again, in dis-
cussing the median test on page 211, the authors say “If both of these dis-
tributions had come from the same or from similar populations, half of the
x-values and half of the y-values would lie above this median of 11,....”

(3) In some of the tests, the null hypothesis is never clearly stated. This occurs on
page 139, where a test is made as to whether the proportions responding Yes
on two items differed significantly. It happens again on page 149, with a Chi-
square test, and again on page 163, in the analysis of variance. The student
seems to find it exceedingly difficult to fill in gaps such as these.

(4) The necessary assumptions are sometimes confused or omitted. As an example
of this, there are two assumptions listed on pages 81 and 82 as being necessary
in order that a correlation coefficient may be computed; the first is “linear
regression” and the second is homoscedasticity. Actually, when computing a
statistic from a set of data, one need of course satisfy no assumptions whatso-
ever; it is only five chapters later, in testing whether the population parameter
may be zero, that assumptions are necessary. And at this point are found no
assumptions at all.

(5) The discussion of random sampling fails to mention why a random sample is
desirable.

(6) There seems to be no mention of the fact that the sample mean is approxi-
mately normally distributed when the sample size is large.
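
The objection to the quoted confidence statement can be made concrete with a small simulation. What follows is only a sketch under stated assumptions: the population mean (70), standard deviation (6.5), and sample size (25) are hypothetical values chosen so that the half-width is close to the 2.55 implied by the quoted band, and they are not taken from the book. The population mean stays fixed; it is the interval that varies from sample to sample, and "95 in 100" describes the long-run share of such intervals that cover the mean.

```python
import random
import statistics


def coverage(mu=70.0, sigma=6.5, n=25, reps=2000, seed=1):
    """Share of known-sigma 95% intervals that cover the fixed mean mu.

    All default values are hypothetical and chosen for illustration only.
    """
    rng = random.Random(seed)
    half_width = 1.96 * sigma / n ** 0.5  # about 2.55 for these defaults
    hits = 0
    for _ in range(reps):
        xbar = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        # The interval changes with every sample; mu does not.
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / reps


print(coverage())
```

The printed share should lie near 0.95. Any single interval, once computed, either contains the mean or does not.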

The reviewer feels that it is unfortunate that a textbook which could be an ex- 
cellent one fails to be for lack of comparatively small rewordings. It should be pointed 
out that the book could be made correct without making it more difficult or more 
mathematical. 


Techniques of Population Analysis. George W. Barclay. New York: John Wiley & Sons, 
Inc., 1958. Pp. XIII, 311. $4.75. 


JACOB S. SIEGEL, Bureau of the Census


THIS work represents a non-mathematical introductory text on methods of popu-
lation analysis addressed to the general student of social statistics. It explains
many of the more important methods of analyzing population data, including the 
newer developments, with instructions in steps in computation, and with explana- 
tions of the logic of the methods and of ways of interpreting the results. Examples 
are drawn from the data of many countries, especially the underdeveloped countries. 
Barclay’s work treats the material in a more graduated fashion, and covers a some- 
what wider range of topics, than A. J. Jaffe’s Handbook of Statistical Methods for 
Demographers, which is also intended for the general student and uses illustrations 
from many countries. In a number of respects, it differs from the corresponding works
by M. Spiegelman, H. Wolfenden, and P. Cox; these were written primarily for 
actuarial students, confine themselves to American or British data, and devote con- 
siderable attention to the problems of data collection. 
The volume consists of 10 chapters and an appendix. Chapters on “The Nature of
Demography,” “Rates and Ratios,” and “Accuracy and Error” serve as general 
introductory chapters. Then follow chapters on “The Life Table,” “The Study of 
Mortality,” “Measurement of Fertility,” “Growth of Population,” and “Migration 
and the Distribution of Population.” Chap. 9 is concerned with “Manpower and 
Working Activities.” A “Conclusion” completes the list of chapters, and an appendix 
on the “Construction of Abridged Life Tables” closes the book. 

Some of the conventional areas of population study—age-sex composition, popu- 
lation distribution, and population estimates and projections, etc.—are relegated to 
parts of various chapters and, as a result, are given limited treatment. On the other 
hand, the life table is given quite generous treatment. Certain topics comparable in 
demographic significance with “manpower,” e.g., marital and family status, are 
omitted altogether, as are the more marginal subjects of marriage and divorce, 
ethnic composition, educational status, and morbidity. Relatively little attention is 
given to graphic methods of analysis, and international standards and recommenda- 
tions receive attention largely in the form of references. A fuller treatment of graphic 
methods would have been especially desirable. 

The chapter on the “Nature of Demography” contains an important miscellaneous 
collection of facts basic to demographic analysis (e.g., sources of data, topics of 
population study, logarithms, and interpolation). Chap. 2 considers the general na- 
ture of rates and ratios and a wide range of specific rates and ratios, from percentage 
distributions to the gross reproduction rate. Including a general introduction to rates 
and ratios is an excellent idea inasmuch as a considerable part of population analysis 
involves such measures. The chapter on “Accuracy and Error” discusses principally 
procedures for finding defects in population data; their application to the total 
population, age and sex distribution, and vital registration; and procedures for re- 
vising defective data. The use of matching studies for evaluating census and vital 
statistics is omitted, although it is now a common technique in the United States.

Chap. 4, “The Life Table,” describes the nature and structure of the life table, the 
relations between the various functions, and briefly, problems in the construction of 
the life table. Considering this chapter and the many other references to life tables 
throughout the book, the reader is not made sufficiently aware that they are often 
merely substitutes for the use of death statistics and that they have many limitations. 
Chap. 5, “The Study of Mortality,” touches on the uses of life tables in mortality 
analysis, population estimates, and the development of hypothetical models (station-
ary and stable populations); then treats the usual descriptive measures, such as crude
death rates, death rates by age, and death rates by cause, occupation, and regions; 
and, finally, discusses the logic and procedures of standardization. The reviewer 
would have preferred a greater emphasis on the wide applicability of standardization 
in population analysis. 

Following a discussion of the peculiarities of birth statistics, Chap. 6, “Measurement
of Fertility,” treats the various types of calendar-year rates and their uses, then the 
rates based on actual reproductive histories. These two principal types of approaches 
are compared and the logical and arithmetic differences between the various measures 
of fertility are considered. Finally, fertility differentials by socio-economic classes
and by regions are discussed. This chapter impressed the reviewer as an excellent and 
sophisticated summary of the subject treated. The chapter on “Growth of Popula- 
tion” discusses measures of actual population growth, some hypothetical models of 
population growth, the age structure of actual and stable populations, and the projec- 
tion of population trends. Instructions are given for the computation of the net repro- 
duction rate, the age distribution and intrinsic rate of increase (r) of the stable popula-
tion, and short-term projections by age and sex. The newer material on the compu-
tation of r and on the effects of changes in vital rates on age structure is included. 
The simple short-cut procedure for computing r as +/NRR is unfortunately omitted, 
as is the use of cohort techniques in projecting fertility. 
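
A short-cut of the kind the reviewer has in mind can be sketched from the standard stable-population approximation NRR ≈ exp(rT), where T is the mean length of a generation. The symbol T, the function names, and the sample values below are assumptions introduced for illustration; they are not taken from Barclay's book or from the review.

```python
import math


def intrinsic_rate(nrr, generation_length):
    """Approximate intrinsic rate r per year from NRR = exp(r * T)."""
    return math.log(nrr) / generation_length


def annual_growth_factor(nrr, generation_length):
    """Equivalent form: the T-th root of NRR is the yearly growth factor."""
    return nrr ** (1.0 / generation_length)


# Hypothetical inputs: NRR of 1.5 and a mean generation length of 28 years.
nrr, T = 1.5, 28.0
r = intrinsic_rate(nrr, T)
print(r, annual_growth_factor(nrr, T) - 1.0)
```

For these inputs r comes to roughly 1.4 percent per year, and the root form gives nearly the same figure because exp(r) - 1 is close to r when r is small.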

Chap. 8, “Migration and the Distribution of Population” stresses the problem of 
the statistical definition of migration, considers in some detail the measurement of 
the volume of migration by several methods, and touches on the effects of migration 
in the way of population redistribution and changing population composition. The 
author chooses to treat internal and international migration in combination. This 
reviewer feels that this approach confuses the questions that are to be answered, the 
types of data available, and the measures used; and the author’s treatment of the 
subject reflects this. The important possibility of measuring internal migration by a 
census question on place of previous residence is hardly mentioned, nor is the use of 
a population register; yet these represent two very valuable approaches to migra- 
tion measurement. The treatment of migration rates is suggestive but does not go far 
enough in an area which badly needs development. “Manpower and Working Activi- 
ties” is devoted largely to a discussion of the definition of economic activity, the 
demographic factors determining labor supply, and the distribution of the eco- 
nomically active population by industry, occupation, and “status.” Consistent with 
its own thesis that analyzing manpower data is “wholly descriptive,” this chapter 
barely goes beyond the discussion of the basic concepts and the simplest descriptive 
measures. Such analytic techniques as measurement of the components of change in
the labor force, the working-force life table, use of standardization, estimates and
projections of the labor force, etc., are not mentioned here. The “Conclusion,”
containing guides to further reading and references to bibliographies and technical
journals, is an especially useful addition to the volume.

This reviewer noted an occasional inaccurate or imprecise statement, some exam- 
ples of which are given. In discussing the computation of the average annual rate of 
growth by two methods (p. 32), the author states that both methods of computation 
are rather crude. Since they both measure average change satisfactorily, one is led 
to ask, in what sense are they crude? This statement would be true if he were dis- 
cussing the preparation of population estimates. On p. 234, it is stated that the 
growth rate in the logistic curve rises to an assumed maximum level and returns again 
to zero; in fact, the rate of growth in the logistic shows a steady decline. Some ques- 
tionable generalizations, more correct for the less industrialized than the highly in- 
dustrialized countries of the world, appear in the book. The author repeatedly states 
that migration rates of young children are quite low (pp. 77 and 254) but this is
certainly not always true in connection with internal migration in highly industrialized
countries. On p. 83 he states that registered deaths should be nearly balanced by 
sex, but this is hardly true in the very low mortality countries, even where the sexes 
are approximately balanced. 
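
The point about the logistic curve can be checked numerically. The sketch below assumes the usual form N(t) = K / (1 + a·exp(−bt)); the parameter values are hypothetical. It computes both the relative rate of growth (increase per unit of population), which falls steadily, and the absolute increase per unit of time, which rises to a maximum and then returns toward zero. The two readings of "growth rate" may account for the disagreement.

```python
import math

# Hypothetical logistic parameters: ceiling K, and constants a and b.
K, a, b = 100.0, 99.0, 0.5

def size(t: float) -> float:
    """Population size on the logistic curve N(t) = K / (1 + a * exp(-b * t))."""
    return K / (1 + a * math.exp(-b * t))

def relative_rate(t: float) -> float:
    """Relative growth rate (dN/dt) / N = b * (1 - N / K)."""
    return b * (1 - size(t) / K)

def absolute_increase(t: float) -> float:
    """Absolute growth dN/dt = N * relative rate."""
    return size(t) * relative_rate(t)

times = list(range(0, 31))
rel = [relative_rate(t) for t in times]
inc = [absolute_increase(t) for t in times]

print("relative rate declines throughout:", all(x > y for x, y in zip(rel, rel[1:])))
print("absolute increase peaks at t =", times[inc.index(max(inc))])
```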

The system of cross-referencing is excellent, worthy of imitation by other text 
writers. On the other hand, the division of material between the main text and foot- 
notes leaves much to be desired, as if there was a lack of principle in the division made. 
The instructions on methods of computation have a commendable clarity. The reader 
is warned, however, that several, presumably inadvertent, errors appear in the com- 
putational formulas and instructions. 

In spite of the weaknesses noted, the volume represents a scholarly job and is 
well recommended for the statistician or social scientist who wishes to train himself
in the specific techniques of demographic analysis. Thus, the book will be especially
useful to a wide variety of technicians who may have occasion to work with
population data. The demographic specialist too may profit from the up-to-date,
succinct review of the various special topics treated. Supplemented perhaps by materials
on the basic topics omitted, this volume will prove to be a worthy text in the area of 
demographic methods. 


Multilingual Demographic Dictionary. English Section. Population Studies, No. 29. 
United Nations. New York, 1958. Pp. viii, 77. $.50. 


This collaborative effort is published in French and Spanish sections as well as
English in order to provide an aid to technical translation rather than a treatise
on demography. Part I is entitled “Text” and gives the definitions classified into nine
chapters corresponding to major parts of demography. Part II is entitled “Index”
and lists in alphabetical order the expressions defined in Part I in a logical order.

W. G. M. 


Statistical Yearbook 1958. United Nations. New York, New York: Columbia University 
Press, 1958. Pp. 612. $8.00 cloth; $6.50 paper. 


A new chapter has been introduced into this 10th issue of the Statistical Yearbook,
presenting a summary in terms of index numbers of world production of primary
commodities and manufactured goods. New tables have also been added on world 
exports and on trends in world trade compared with trends in population and produc- 
tion, as well as a table summarizing aid to under-developed countries by contributing 
countries and agencies. 

Various tables relating to manpower, agriculture, national income, and public 
finance have been modified primarily to present data in a more uniform manner. 
Tables have been omitted on expectation of life, economically active population, 
ocean freight rates, value, quantity and unit value of imports and exports, cash 
operations and illiteracy. No reasons for these omissions are given. 


R. F. 


Yearbook of International Trade Statistics 1956, Volume I and II; 1957, Volume I and II. 
Statistical Office of the United Nations, Department of Economic and Social Affairs. New 
York: Columbia University Press, 1957; 1958. Paper. 1956 Volume I, Pp. 629, $7.00; Volume II,
Pp. 155, $1.50. 1957 Volume I, Pp. 622, $6.00; Volume II, Pp. 155, $1.50.


Walther P. Michael, Ohio State University


These books are the seventh and eighth issues of a series which began in 1950,
covering international merchandise trade data. The series is a continuation of 
International Trade Statistics, published by the League of Nations for the years 
1933-1939, and a number of similar compilations for earlier years. Starting with these 
issues, the Yearbook appears in two volumes. 

It is encouraging to find again additional countries included in these volumes. 
The number has increased to 115 in 1956, and 118 in 1957, as against 104 in the 1955 
issue and 42 in 1950. Ninety-eight percent of world trade is covered. The main addi- 
tions in 1956 are several of the Soviet Bloc countries and, in 1957, the Soviet Union. 
The increased number of dependent territories for which separate data have become 
available is also welcome. 

Volume I contains the country tables, preceded by four summary tables. The sum- 
mary tables are also included in Volume II which presents tables in matrix form of 
imports and exports, according to the ten sections of the Standard International 
Trade Classification (SITC).

For most countries the information is organized in four tables. Table 1 presents
historical series of annual imports c.i.f. (with the exception of certain countries com- 
piling their imports f.o.b.) and exports f.o.b., usually in national currencies; gold 
imports and exports by value; conversion factors into post-1934 dollars; and volume 
and price indices for exports and imports, where available. Data generally run from 
1930. The formulas and methods of computation of the indices are noted in each case 
(usually Laspeyres or Paasche, a few Fisher’s ideal, or chained, calculated on, or 
switched to, a 1953 base). The gold series, it must be kept in mind, represents “trade 
in gold” as it appears in the trade record and not, of course, gold movements in the 
balance of payments sense which are often different by magnitude and even direction 
of net flow. Tables 2 and 3 give annual import and export weights and values, for the 
current and recent years, broken down by commodity classes. For the majority of 
the countries (77 in 1957) this breakdown is made according to the SITC into ten 
one-digit sections and three-digit groups. For the more important commodities five- 
digit items are also shown. Table 4 shows annual total imports and exports broken 
down by partner countries. The data were taken from official publications or were 
supplied by the governments, including those of the Soviet Bloc countries. The com- 
parability of the Soviet Bloc data (particularly for intra-Soviet Bloc trade) to the 
other trade data in the volume is uncertain, it is pointed out. 
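
For readers less familiar with the index formulas named above, the following sketch computes Laspeyres, Paasche, and Fisher ideal price indices for a two-commodity example. The prices and quantities are invented for illustration and are not taken from the Yearbook; the function names are likewise hypothetical.

```python
import math

def laspeyres(p0, p1, q0):
    """Price index with base-period quantities as weights."""
    return sum(p * q for p, q in zip(p1, q0)) / sum(p * q for p, q in zip(p0, q0))

def paasche(p0, p1, q1):
    """Price index with current-period quantities as weights."""
    return sum(p * q for p, q in zip(p1, q1)) / sum(p * q for p, q in zip(p0, q1))

def fisher(p0, p1, q0, q1):
    """Fisher's ideal index: geometric mean of Laspeyres and Paasche."""
    return math.sqrt(laspeyres(p0, p1, q0) * paasche(p0, p1, q1))

# Hypothetical base-period (0) and current-period (1) prices and quantities.
p0, q0 = [1.0, 2.0], [10.0, 5.0]
p1, q1 = [1.2, 2.5], [8.0, 6.0]

L = laspeyres(p0, p1, q0)
P = paasche(p0, p1, q1)
F = fisher(p0, p1, q0, q1)
print(f"Laspeyres {L:.4f}  Paasche {P:.4f}  Fisher {F:.4f}")
```

The Fisher index always lies between the other two, which is one reason a compiler might prefer it when the Laspeyres and Paasche results diverge.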

Table A of the summary tables presents annual world trade, in dollars and adjusted 
for uniform coverage, by countries grouped in regions, for recent and selected earlier 
years, while Table B consists of a matrix of world exports by provenance and
destination, showing provenance by regions and the more important countries, but
destination in less detail. This matrix is based mainly on DIT data (Direction of International
Trade, published jointly by the U.N., the I.M.F., and the I.B.R.D.). Since these data 
are unadjusted and because of the inclusion of Soviet Bloc trade estimates, omitted 
from Table A, Table B cannot be reconciled with Table A. Table B, like Table A, 
serves then mainly as a convenient means to obtain aggregates by order of magnitude, 
but the student interested in more exact regional information should recognize the 
shortcomings of a one-valued matrix of this sort. Frequently there are considerable 
discrepancies between the reports of trading partners, apart from the discrepancy 
which naturally arises when one reports c.i.f. while the other reports f.o.b. 

Volume and price indices are presented for world exports by regions in Table C, 
and broken down into manufactured goods and primary commodities in Table D. 
The regional indices were obtained by revaluing exports with applicable official or 
estimated indices at 1948 and 1953 prices for earlier and later years, respectively, and 
linking the earlier to the later series at 1950. The indices are all weighted by relative 
dollar values in 1953. There is also in Table D a price index series for primary com- 
modities with subdivisions, based on both export and import prices. This was done 
for reasons of wider coverage, but mainly because the prices of these commodities are 
largely determined by the markets in the importing countries and, consequently, 
their export prices are influenced by changes in freight rates. This series, therefore, 
indicates broad price movements of primary commodities in the world market, but, 
by virtue of its mixed construction, it cannot serve for the observation of movements 
in the terms of trade, either for the exporters or the importers of these commodities. 
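
The linking step described above can be sketched as a proportional splice at the overlap year. The Yearbook's exact procedure is not spelled out here beyond the statement that the series are linked at 1950, so the rescaling rule below is an assumption, and all index values are invented.

```python
def link_series(earlier, later, link_year):
    """Rescale the earlier series to the level of the later one at link_year,
    then join them. Assumes both series contain link_year."""
    factor = later[link_year] / earlier[link_year]
    linked = {year: value * factor
              for year, value in earlier.items() if year < link_year}
    linked.update(later)
    return linked

# Hypothetical volume indices: earlier series at 1948 prices, later at 1953 prices.
earlier = {1948: 100.0, 1949: 104.0, 1950: 110.0}
later = {1950: 88.0, 1951: 94.0, 1953: 100.0}

linked = link_series(earlier, later, 1950)
for year in sorted(linked):
    print(year, round(linked[year], 1))
```

Splicing of this kind preserves year-to-year movements within each segment, but the linked series reflects two different sets of price weights on either side of the link year.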

Perhaps the most valuable contribution of the Yearbook is the classification of trade 
according to the SITC code, making available in one place the composition of trade 
by country in a degree of comparability which is not found in individual trade records. 
Most of the important (non-Soviet) trading countries are now covered, India being 
the most important new addition. Separate matrices for the ten SITC sections of
commodities, based on the trade of the countries that also report a geographical
breakdown, were first featured in the 1955 issue when twenty countries so reported.
The number has now increased to 26. Since these include most of the industrial and 
a number of other large trading countries, a considerable part of the flow of world 
trade is now covered in this form. Imports and exports of the 26 countries, with geo- 
graphically detailed areas and a large number of important countries, are shown for 
each of the ten (0 and 1 combined) SITC sections. 

That nearly eighty countries now report their trade statistics in the SITC form is 
a considerable achievement since the system was only introduced in 1950. However, a 
few SITC series, including those of Argentina and of Brazil’s imports, have not been
continued in these issues. The problem with many international trade and balance of 
payments statistics is the lack of continuity in form and detail. It is to be hoped that 
the existing SITC series will be continued and the coverage further expanded, par- 
ticularly the regional reporting. 


World Economic Survey, 1958. United Nations, Department of Economic and Social Affairs, 
New York: Columbia University Press, 1959. Pp. xv, 298. $3.00. 


Richard J. Foote, Connell and Company


This report is the fourth in the series to contain a special study of an economic
question of general interest. A comprehensive discussion of post-war world trade
in primary commodities and of national and international commodity policies makes
up almost two-thirds of its contents. Emphasis is placed on the need to mitigate the 
impact of price instability for major raw materials by some form of international
action in order to place economic development in the under-developed countries on a
more stable footing. Evidence is presented to show that the expansion of world 
capacity has become sufficient to create problems of surpluses for most major com- 
modities that enter into world trade. Thus, export prospects of the under-developed 
countries and their resultant ability to import needed capital equipment depend 
largely upon the long-term growth of demand in the industrial countries. Govern- 
ments acting independently can do little to stabilize export proceeds from individual 
commodities. Because of the many difficulties that surround international commodity 
stabilization policies, the report indicates that attention now is being focused on 
proposals for strengthening directly the liquidity position of the under-developed 
countries. 

Analyses of general agricultural policies are given separately for Western Europe 
and the United States. A short section discusses policies that affect minerals. About 
25 pages are devoted to both long-term and short-term activities of the primary 
producing areas, using as examples the programs of individual countries. Interna- 
tional commodity arrangements are discussed in rather general terms, with particular 
emphasis on wheat, sugar and tin. 

Of particular interest to this reviewer was a long chapter on problems of primary 
commodities in centrally planned economies, as less has been published on this sub- 
ject. Here, as is generally known, the problem is one of persistent shortages rather 
than of surplus. The report discusses in detail problems of pricing in such economies. 

Part II deals with the subject suggested by the title of the report, namely, a survey 
of current economic developments in major countries of the world. The discussion 
covers trends for (a) industrial countries, (b) primary exporting countries, and (c) 
centrally-planned economies. 

This report contains a wealth of background statistics and information that would
be difficult to assemble from other sources. It will serve as a highly convenient
reference work for many years, particularly with respect to its major topic, namely,
government policies for primary commodities in the post-World War II period.

1 Regional detail for the 150 three-digit groups can be found in Commodity Trade Statistics, U. N. Statistical
Papers, Series D.


Frequency of Change in Wholesale Prices: A Study of Price Flexibility. A Study
Prepared for the Joint Economic Committee. United States Department of Labor, Bureau of
Labor Statistics. Washington, D.C.: U. S. Government Printing Office, 1959. Pp. vi, 88. $0.30.


Elmer C. Bratt, Lehigh University


With all of the hue and cry on inflation, it is a pleasure to report on a factual
analysis of price changes. The pamphlet under review was prepared under the 
direction of H. E. Riley of BLS at the behest of Senator Douglas. It provides current 
measurements comparable to those found in The Structure of the American Economy, 
published in 1939. Care is taken to describe the broad changes in the series included 
in the Wholesale Price Index since the earlier publication. To provide a contrast with 
earlier data some general measurements are given for commodities included in the 
earlier study. 

The principal measurements shown relate to frequency and amplitude of price 
change. Very helpful suggestions are made about the various kinds of influence which 
make for price changes, several of which do not relate to degree of competition; some 
are determined, for instance, by the technical nature of the collection process. In 
general, however, both frequency of change and amplitude of change are related to 
the degree of competition, although far less simply than popularly assumed. From 
1947 to 1956, the amplitude of price movements differed remarkably from average only 
in those commodities showing the most frequent price changes, which happened also 
to be those commodities least fabricated. The most important causal influence would 
seem to be degree of fabrication. 

A study of matched commodities shows that the distribution of price changes cor- 
responds closely with that occurring from 1926-33 as shown by the 1939 study. Per- 
centage distribution of the frequency of price change is roughly similar. This is a 
remarkable conclusion in view of the great difference in economic conditions between 
postwar and the earlier period. 

One of the conclusions developed in the study is that long-time price movement is 
independent of price flexibility. This conclusion appears to be of major importance be- 
cause it conforms with the assumption that competition is controlling in the long run. 

In addition to summary charts and tables, separate measurements are shown on 
frequency and amplitude of change for the 1,789 commodities included in the study. 
Weights are also given so that the interested individual can compute his own 
measurements. 


The Volume of Mortgage Debt in the Postwar Decade. Saul B. Klaman. Technical Paper 
No. 13. New York: National Bureau of Economic Research, 1958. Pp. xv, 143. $2.00,
paper. 


Andrew F. Brimmer, Michigan State University


Research in the field of monetary economics has been greatly hampered by
numerous gaps in mortgage statistics and by the dubiousness of some of the series 
which were available. Fortunately, this situation has been largely corrected with the 
appearance of Klaman’s paper. This is such an excellent job—and the shortcomings 
are so minor—that I view my task as solely that of presenting some idea of its contents. 

This publication is the first result of the Postwar Capital Market Study undertaken
by the National Bureau and financed by a grant from the Life Insurance Association
of America. The over-all study was described in the National Bureau’s Thirty-sixth
Annual Report, 1956 as “an analysis of the structure and development of the American
capital market in the decade 1946-1955 that ties a description of the institutional
setting and a discussion of the major economic problems involved to an integrated 
statistical framework of the flow of funds through the capital market and of the 
assets and liabilities of financial institutions active in the market.” 

The National Bureau divided the capital market into three main sectors (govern- 
ment securities, corporate securities and loans, and nonfarm mortgages), and Klaman 
was asked to concentrate on nonfarm mortgages. The present paper is a prelude to 
his comprehensive analysis of this market and was originally intended to appear as 
part of the latter. However, because of frequent requests for permission to use some 
of the series, the National Bureau decided to publish the statistics as a separate paper. 

Because of this decision, economists and statisticians now have a comprehensive 
and reliable set of data on the total amounts of indebtedness outstanding against 
different types of properties and on mortgages held by the leading groups of investors 
(particularly by the main financial institutions) for the period 1945 through 1956. 
(Statistics on net mortgage flows are also presented, but these are simply the first 
differences in the amounts outstanding.) Only annual statistics are given through 
1952, but quarterly figures are presented for the last four years. Klaman groups his 
series into six property and three mortgage types. By property they are: nonfarm, 
residential, one-to-four family, multifamily, non-residential, and farm; the mortgage 
types are: those insured by the Federal Housing Administration (FHA), those guar- 
anteed by the Veterans Administration (VA), and those not insured or guaranteed 
(conventional). The series also show the following ownership of mortgages: main 
financial institutions (savings and loan associations, life insurance companies, com- 
mercial banks and mutual savings banks), Federal agencies, and all other holders of 
which the largest are individuals. 
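
Since the net mortgage flows are described as simply the first differences of the amounts outstanding, the derivation can be sketched in a few lines. The year-end figures below are invented for illustration and are not Klaman's estimates.

```python
def net_flows(outstanding):
    """First differences of a series of amounts outstanding."""
    return [later - earlier for earlier, later in zip(outstanding, outstanding[1:])]

# Hypothetical year-end mortgage debt outstanding, in billions of dollars.
outstanding = [35.5, 41.8, 48.9, 56.2]

flows = net_flows(outstanding)
print([round(f, 1) for f in flows])
```

A series built this way measures net change only; gross lending and repayments cannot be separated without additional data.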

Aside from the contribution which Klaman makes through new estimates of mort- 
gage debt (especially the quarterly figures) and the revision of existing series, he also 
presents a cogent appraisal of the quality of data in the field of mortgage financing. 
Information on mortgages held by individuals is particularly deficient, being poor
in quality, scarce in quantity, and infrequent in appearance. Data on type of mortgage
borrower seldom—if ever—distinguish between corporate and noncorporate
debtors. Statistics describing the mortgage investments of financial institutions other 
than the leading ones could also be expanded, and figures for the latter could be fur- 
ther improved. 

Klaman suggests two avenues leading to an improvement of mortgage statistics: 
(1) new and more frequent benchmarks should be developed for total mortgage debt 
and its components; (2) mortgage investors should report their holdings more fre- 
quently and in greater detail. Starts along these lines have been made in the 1950 
Census of Housing and in a special Census Bureau survey of mortgage debt on 
owner-occupied one-to-four-family properties as of December, 1956; similar ques- 
tions will probably be included in the 1960 census. Nevertheless, much remains to 
be done in this area, and Klaman presents numerous suggestions for further work. 


New England Economic Indicators, Second Edition. Chris A. Theodore. Boston: Bureau 
of Business Research, Boston University, College of Business Administration, 1957. Pp. 
vi, 90. $2.50. Paper. 


Elmer C. Bratt, Lehigh University


This chartbook pictures changes in population, employment, agriculture, manufacturing,
transportation, finance, income, prices, business population, and some
consumption series for New England and, in some cases, separately for each of the six
states. Most series are on a yearly basis and end with 1956. Tabular data are pro- 
vided and procedures and sources are stated. The graphs are effectively drawn and 
a scanning will give the reader a good picture of the structural changes which have 
been occurring in New England. 


Information Theory and Statistics. Solomon Kullback. New York: John Wiley and Sons, 
Inc.; London: Chapman and Hall, Ltd.; 1959. Pp. xvii, 395. $12.50. 


D. V. Lindley, University of Cambridge, England


The book is written around the following fundamental ideas.

(i) Let $f_1(x)$, $f_2(x)$ be two probability densities; then

$$I(1:2) = \int f_1(x) \log \frac{f_1(x)}{f_2(x)} \, dx \tag{1}$$

is a suitable measure of the difference between $f_1(x)$ and $f_2(x)$: Kullback calls it the
“mean information for discrimination in favor of $f_1(x)$ against $f_2(x)$ per observation
from $f_1(x)$” (p. 5).

(ii) For fixed $f_2(x)$ and fixed $T(x)$, the minimum value of $I(1:2)$, as $f_1(x)$ ranges
through the class of distributions for which the expectation of $T(x)$ with respect to
$f_1(x)$ is equal to a fixed value $\theta$, is attained when $f_1(x)$ is proportional to

$$e^{\tau T(x)} f_2(x) \tag{2}$$

where $\tau$ is a function of $\theta$. (This is a form of the Cramér-Rao inequality.) (p. 38).

(iii) As $\theta$, and therefore $\tau$, varies the family (2) is an exponential family with
$T(x)$ as a sufficient statistic, providing an unbiased estimate of $\theta$ of minimum
variance (p. 44).

(iv) If we wish to test the simple null hypothesis $H_2$ that the density is $f_2(x)$ against
the composite alternative $H_1$ that the density belongs to the family (2) we may use the
statistic $\hat{I}(*:2)$ which is the minimum value of $I(1:2)$ in (ii) for $\theta = \hat{\theta}(x) = T(x)$, where
$x$ is the observed value; rejecting if $\hat{I}(*:2)$ is too large. (Crudely, the idea is to find
the member of $H_1$ which is nearest to—diverges least from—$H_2$.) (p. 81).

(v) If $H_2$ is composite the same object may be accomplished by using the minimum
of $\hat{I}(*:2)$ as $f_2(x)$ ranges throughout $H_2$ (p. 85).
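
Ideas (i) and (ii) can be illustrated with a small discrete sketch, in which the integral in (1) is replaced by a sum. The example is entirely hypothetical: $f_2$ is taken as uniform on the faces of a die, $T(x) = x$, and $\theta = 4.5$. The function names and the bisection search for $\tau$ are choices made here for illustration, not Kullback's procedure.

```python
import math

def discrimination_information(f1, f2):
    """Discrete analogue of I(1:2): sum of f1 * log(f1 / f2), with 0 * log 0 = 0."""
    return sum(p * math.log(p / q) for p, q in zip(f1, f2) if p > 0)

def tilt(f2, t_values, tau):
    """Distribution proportional to exp(tau * T(x)) * f2(x), as in (2)."""
    weights = [q * math.exp(tau * t) for q, t in zip(f2, t_values)]
    total = sum(weights)
    return [w / total for w in weights]

def expectation(f, t_values):
    """Expectation of T(x) under the distribution f."""
    return sum(p * t for p, t in zip(f, t_values))

def solve_tau(f2, t_values, theta, lo=-20.0, hi=20.0):
    """Find tau by bisection; the tilted expectation of T increases with tau."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if expectation(tilt(f2, t_values, mid), t_values) < theta:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical example: f2 uniform on the faces of a die, T(x) = x, theta = 4.5.
faces = [1, 2, 3, 4, 5, 6]
f2 = [1 / 6] * 6
tau = solve_tau(f2, faces, 4.5)
tilted = tilt(f2, faces, tau)

# Another distribution with the same expectation of T, for comparison.
other = [0.0, 0.0, 0.25, 0.25, 0.25, 0.25]

print("I(tilted:2) =", discrimination_information(tilted, f2))
print("I(other:2)  =", discrimination_information(other, f2))
```

Both distributions satisfy the constraint on the expectation of $T(x)$, but the exponentially tilted one diverges less from $f_2$, in line with statement (ii).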

We thus have a method of testing hypotheses about exponential families which is 
applied in Chap. 6 to multinomial populations; in Chap. 7 to Poisson populations; 
in Chap. 8 to contingency tables, and in Chaps. 9-13 to normal populations, dealing 
with tests of linear hypotheses (univariate and multivariate), tests of homogeneity 
and linear discrimination. In all cases the analysis is for samples of fixed size. The 
earlier chapters develop the fundamental ideas quoted above. The style is severely 
mathematical and in summarizing the ideas I have ignored the qualifications that are 
necessary in presenting a rigorous account. 

The book proper begins “Consider the probability spaces $(\mathcal{X}, \mathcal{S}, \mu_i)$, $i = 1, 2, \cdots$”
(p. 3) and continues in this vein. Thus (1) is written with densities defined (by the
Radon-Nikodym theorem) with respect to a $\sigma$-finite measure $\mu(x)$. This is a perfectly
proper and sensible thing to do, but a reviewer is bound to mention it because it 
affects considerably the class of potential readers. Not only is the mathematics at 
the beginning abstract but later on it becomes complex: even by p. 27 we have some 
pretty formidable expressions but by pp. 243-5 one begins to feel that we have 
moved into a realm where mathematical symbolism is perfection and the use of the
English language almost indecent. Nevertheless, the mathematics seems remarkably
free from errors and the style, for someone who can take such heavy doses of
symbolism, good. There are lots of problems at the end of each chapter, some of them
dealing with practical statistical situations. There are copious references to other
authors, numerous examples throughout the text, an excellent bibliography, a
glossary and index, and tables of the following functions: (i) $\log_e n$ and $n \log_e n$ for
$n = 1(1)1000$, to ten places of decimals; (ii) $p_1 \log(p_1/p_2) + q_1 \log(q_1/q_2)$, where $p_i + q_i = 1$,
for $p_1, p_2 = 0.01(.01)0.05(.05)0.95(.01)0.99$, to seven places of decimals; (iii) the 5%
points of Fisher’s $B^2$ (related to non-central $\chi^2$) for $\beta = 0(.02)5.00$ and $\nu = 1(1)7$.

The merit of this book lies in the unity of the treatment based on the fundamental 
ideas. Most of the techniques that statisticians use in dealing with samples of fixed
size from exponential distributions are derived by use of them. Notable exceptions
are certain exact, or small-sample, techniques, such as Fisher’s test for the 2×2
contingency table. This is surely valuable for aesthetic and pedagogical reasons, besides
the fact that one would expect that such a unified approach to statistics would sug- 
gest new developments: there are some indications that this is so in the later chap- 
ters on multivariate analysis, which present a compact and coherent account of the 
subject. One attractive outcome of the information approach is the treatment of
contingency tables based essentially on $\sum O \log(O/E)$ instead of the usual $\chi^2$ statistic,
$\sum (O-E)^2/E$. The additivity of information, but not of $\chi^2$, is an appealing
property. The student (and he will have to be an advanced student, for the basic ideas of
probability and statistics are assumed known) will welcome the unity of ideas, but
will undoubtedly be worried when reading outside of the text because most writers
do not adopt Kullback’s approach.
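
To make the comparison of the two contingency-table statistics concrete, the sketch below computes both for an invented 2×3 table, with expected counts taken from the row and column totals under independence. The information statistic is multiplied by 2, the conventional scaling that puts it on the same footing as the chi-square statistic in large samples; the counts and function names are assumptions for illustration.

```python
import math

def expected_counts(table):
    """Expected counts under independence of rows and columns."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def information_statistic(table):
    """2 * sum of O * log(O / E), treating 0 * log 0 as 0."""
    expected = expected_counts(table)
    return 2 * sum(o * math.log(o / e)
                   for obs_row, exp_row in zip(table, expected)
                   for o, e in zip(obs_row, exp_row) if o > 0)

def chi_square_statistic(table):
    """Sum of (O - E)^2 / E."""
    expected = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))

# Hypothetical 2 x 3 table of observed counts.
observed = [[20, 30, 50],
            [30, 30, 40]]

g = information_statistic(observed)
x2 = chi_square_statistic(observed)
print(f"information statistic {g:.4f}, chi-square {x2:.4f}")
```

For a table of this size the two statistics are close, as the large-sample theory suggests; they can differ more noticeably when some expected counts are small.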

A critical evaluation of the book hinges on a discussion of the statements (i)—(v) 
above. (ii) and (iii) are statements of mathematical results but they incorporate the 
concept of unbiasedness, and they restrict the techniques to the exponential family. 
The latter is not serious—indeed an adequate account of the statistics of this family 
would be enough for most practical purposes—but the former is. To demand that a 
statistic has no bias is difficult to justify in any development of the subject, and, with 
the rising popularity of the use of the likelihood function through the writings of 
Savage, Barnard and others, is less attractive than it used to be. The use of (1) is the 
essential theme and is most open to objection. It is a pity that Kullback does not 
spend more time in the book talking around the notion—in words and not symbols. 
My own view is that he has been correct in using the logarithmic measure, but that
he has used it wrongly. I feel that there is a need for some measure of information 
in statistical work, quite apart from considerations of utilities and decisions, and that 
this measure should be additive. Hence the logarithm. But also the information 
should take account of our prior knowledge. If one has strong prior opinions then a 
suggested experiment may be expected to be uninformative, but if one’s opinions are
vague the same experiment may be expected to yield much information. This point 
arises in Shannon’s original work where he points out that information is essen- 
tially a statistical idea; that is that a message must be considered in relation to a set 
of messages: similarly a parameter must be considered in relation to a distribution 
of parameters. Consequently Kullback’s measure (1), which is also used by Jeffreys, 
seems to me to be wrong in principle, just as Fisher’s is. Of course in large samples 
they are all right (and all equivalent) because the effect of prior knowledge in a large 
experiment is necessarily slight. All the tests in this book are really large-sample 
ones, some of them accidentally having small-sample optimum properties:
consequently Kullback’s ideas do not result in different answers from other people’s.

That criticism of (i) is a personal one. (iv) and (v), however, can be criticized on 
the grounds that although the technique is appealing, it leads to the same criterion
as does the likelihood ratio principle (p. 94). Indeed, Kullback bases the distributional
theory of his criterion on the known distributional theory of the likelihood ratio. 
Perhaps it is only because the ratio has been with us longer, but I would have thought 
that that principle was intuitively more appealing than Kullback’s more complicated
one. Also we know that the likelihood ratio test can be absurd in small samples;
can the same be said of the information test? Unfortunately there is no discussion of
small-sample properties, apart from certain inequalities. In the case of tests of uni- 
variate linear hypotheses the property that the F-test has of being uniformly most 
powerful amongst invariant tests is a stronger reason for its adoption than Kullback’s. 

To summarize: this is an advanced book on the large-sample theory of tests
involving Poisson, multinomial and normal distributions treated mathematically by a new
unifying approach which is open to objections but which yields results, seen to be 
satisfactory by other methods. Incidentally, it is a book only for the expert statis- 
tician; others interested in information theory will find almost nothing of value to them. 


Alcune memorie matematiche. F. P. Cantelli. Pubbl. Facoltà Economia e Commercio,
Università di Roma. Milano: A. Giuffrè, 1958. Pp. xxx, 448. Ital. Lire 4.000.


Leonard J. Savage, University of Chicago


F. P. Cantelli’s pioneering work in probability theory, financial and actuarial
mathematics, and statistics is largely unknown save through indirect references.
Many of his papers are almost inaccessible and almost all are written in Italian. This
book, dedicated to Cantelli on his 80th birthday, reproduces a selection of 19 of the 
90 “principal papers” listed and summarized in the bibliography, which constitutes 
an important part of the book. Both the reprinted papers and the summaries con- 
tribute greatly to the accessibility of Cantelli’s work, but it is regrettable that an 
English or French translation of the summaries was not included, for these would 
have been of great interest to many who cannot read Italian. 

A rough idea of the scope of Cantelli’s research is given by listing some of the 
topics covered by the papers in the bibliography. (The numbers refer to the bibli- 
ography, where an asterisk means that the paper is reprinted in the book and “F” 
means that it is in French.): 

The strong law of large numbers (*24, 33, *55, 61F, 80); convergence in proba- 
bility theory (*23, 41, *58F); foundations of abstract probability theory (*54, 59, 
62, 67); if X is normal and Y is normal conditionally on X, when is X+Y normal? 
(*27, 88); systematic use of random variables in actuarial problems, risk theory, etc.
(14, *15, 40, 44, 70, *74, *78); generalizations of the Bienaymé-Tchebycheff inequality
(*15, 16, 42); laws of interest, their “divisibility,” etc. (10, *19, 34, 73, 81); tables
of multi-cause elimination and of “mutuality,” social insurance (*19, 37, 38, 66, 79, 
87). Many other topics might have been mentioned. 

Not only does the book provide access to the works of a great man, it will also cast 
light on the history of currently widespread ideas, such as the strong law of large 
numbers, and, even more important, it may attract new attention to certain ideas, 
such as tables of mutuality, that have perhaps been too lightly passed over. 


Formeln und Tabellen der mathematischen Statistik. Ulrich Graf and Hans-Joachim 
Henning. Berlin, Göttingen, Heidelberg: Springer-Verlag, 1958. Pp. vii, 104. DM 12.60.


Gottfried E. Noether, Boston University


This is a new printing (incorporating some minor corrections) of the edition which
appeared in 1953. In the preface, the authors state their intentions in approximately
the following words: “While the professional mathematical statistician finds
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his formulae and tables in reference and text books of mathematical statistics, the 
statistical minded engineer requires a brief and convenient collection of essential 
formulae, tables and nomograms. The present booklet represents an attempt to 
provide such a collection. For this reason, any proofs and derivations have been 
omitted... .” 

The book is in four parts: (1) formulae (31 pages); (2) examples of the application
of some of the formulae in part one (26 pages); (3) tables and nomograms (43 pages,
20 of which are not primarily statistical, dealing with factorials, squares and square
roots); (4) references (4 pages). The subject matter covered is limited to fairly elementary
problems of statistical inference and control chart technique as found in almost
any elementary statistics text in the English language.

This reviewer has doubts about the usefulness of the book. It certainly should not
be used by anybody who has not had a thorough grounding in the principles of statistical
inference. Indiscriminate application of the material presented in the first
part—which is rather likely by inexperienced users of the book—can do only more
harm than good. As a typical example, take the statement of what to do with extreme
observations: “Among (N+1) observations $x_i$, there is one ($x_{N+1}$) which is surprisingly
(italics by the reviewer) large. May it be omitted from the sample on the basis
of its small probability? . . . If mean and variance of the underlying normal population
are unknown and $\bar{x}$ and $s^2$ are the sample mean and variance computed from
$x_1, \ldots, x_N$, then $x_{N+1}$ is to be omitted if $x_{N+1} > \bar{x} + ks$. . . .” (The definition of $k$
follows.)
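
The mechanics of the quoted rule can be sketched in a few lines of Python. This is only an illustration of the inequality as quoted, not the booklet's procedure: the function name is hypothetical, the multiplier `k` is left to the caller because its definition is not reproduced here, and the use of the n - 1 divisor for the sample variance is an assumption. The reviewer's caution against indiscriminate use applies equally to the sketch.

```python
from statistics import mean, stdev


def exceeds_threshold(sample, suspect, k):
    """Return True if `suspect` is greater than x_bar + k * s.

    `sample` holds the N observations other than the suspect value.
    `k` must be supplied by the caller; its definition is not given in
    the passage quoted above, so no default is assumed here.
    """
    x_bar = mean(sample)
    # Assumption: s is the square root of the sample variance with the
    # n - 1 divisor; the quoted passage does not say which divisor is meant.
    s = stdev(sample)
    return suspect > x_bar + k * s


if __name__ == "__main__":
    observations = [9.8, 10.1, 10.0, 9.9, 10.2]
    # k = 3.0 is an arbitrary illustrative value, not one taken from the booklet.
    print(exceeds_threshold(observations, 12.5, k=3.0))
    print(exceeds_threshold(observations, 10.3, k=3.0))
```

The comparison itself is trivial; whatever statistical content the rule has lies in the choice of `k` and in the normality assumption, which is where an inexperienced user is most likely to go wrong.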
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1959. $2.50. 

Pearson, E. S. and Wishart, John (Editors).
“Student’s” Collected Papers. New
York, New York: Cambridge University
Press, 1958. $3.00.

UNESCO. Statistics on Libraries: Statistical
Reports and Studies. New York, New
York: Columbia University Press, 1959.
$1.50.

U. S. Congress, Joint Economic Committee.
Economic Policy in Western Europe.
Washington, D. C.: U. S. Government
Printing Office, 1959. $1.25.


830 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1959


U. S. Department of Health, Education and
Welfare (Walter H. Gaumnitz, Emanuel
Reiser, and Mary Anne Harvey). Statistics
of Local School Systems: 1955-56;
Rural Counties. Washington, D. C.:
U. S. Department of Health, Education
and Welfare, Office of Education, 1959.
$1.50.

Zeisel, Hans, Kalven, Harry, Jr., and 
Buchholz, Bernard. Delay in the Court: 
An Analysis of the Remedies for Delayed 
Justice. Boston, Mass.: Little, Brown and 
Company, 1959. $7.00. 
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AN INVITATION 
TO JOIN ORO 


Pioneer In Operations Research 


Operations Research is a young science, earning 
recognition rapidly as a significant aid to decision- 
making. It employs the services of mathematicians, 
physicists, economists, engineers, political scientists,
psychologists, and others working on teams to syn- 
thesize all phases of a problem. 


At ORO, a civilian and non-governmental organ- 
ization, you will become one of a team assigned to 
vital military problems in the area of tactics, strategy, 
logistics, weapons systems analysis and communi- 
cations. 


No other Operations Research organization has 
the broad experience of ORO. Founded in 1948 by 
Dr. Ellis A. Johnson, pioneer of U. S. Opsearch, 
ORO’s research findings have influenced decision- 
making on the highest military levels. 


ORO’s professional atmosphere encourages those
with initiative and imagination to broaden their
scientific capabilities. 

ORO starting salaries are competitive with those 
of industry and other private research organizations. 
Promotions are based solely on merit. The “fringe” 
benefits offered are ahead of those given by many 
companies. 


The cultural and historical features which attract 
visitors to Washington, D. C. are but a short drive 
from the pleasant Bethesda suburb in which ORO is 
located. Attractive homes and apartments are within 
walking distance and readily available in all price 
ranges. Schools are excellent. 


For further information write: 


OPERATIONS RESEARCH OFFICE 


The Johns Hopkins University 


6935 ARLINGTON ROAD 
BETHESDA 14, MARYLAND 
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Mocrod Colevleting Machin 


READY DECEMBER 15! 


Mathematical Methods and Theory
in Games, Programming, and Economics


by Samuel Karlin, Stanford University 


A rigorous, unified presentation of the concepts of game theory
and programming theory, together with the related concepts of mathematical
economics. It is intended for use as a text on the advanced
undergraduate-graduate level, and as a reference work for those concerned
with the analysis of management problems, economic studies,
military tactics, and general operations research.


The two major parts of Volume I, and Volume II, are independent
of each other except for certain material of an advanced nature, and
each part may be studied independently in conjunction with the relevant
appendixes. For the reader’s convenience, a diagram showing the
interrelationship of the various non-advanced portions of the work is
contained in each volume.


Volume I: Matrix Games, Programming, and Mathematical Economics


433 pp., 16 illus., 1959—$10.75 


Volume II: The Theory of Infinite Games 
386 pp., 10 illus., 1959 —$10.75 


ADDISON-WESLEY PUBLISHING CO., INC. 
Reading, Massachusetts, U.S.A. 
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Statistical 
QUALITY CONTROL ENGINEER 


Experienced in applications of statistical techniques to manufacturing
and experimental design programs.

Our new facility, completely integrated under one roof, is a 
metallurgical pilot plant located 40 miles east of Pittsburgh, 
Pa., presently engaged in specialty casting, forging, rolling, 
heat treating, powder metal development, magnetic and other 
specialty metals. 

REQUIREMENTS: Minimum BS degree. 
Prefer Master’s in Statistics or Engineering with experience 
in metals field statistical programs. Salary $8,000 to $10,000. 
Send confidential resume to: 
Mr. Clarke T. Hamilton 


Westinghouse Electric Corp.
Metallurgical Development Facility 
Blairsville, Penna. 


MATHEMATICIAN 


An exceptional opportunity exists at the Goodyear Atomic Corpora- 
tion for a Mathematician with an M.S. Degree. The principal duties 
will require creative mathematical analysis and solution of problems 
arising in diffusion cascade theory and operations, in addition to prob- 
lems of other fields related to these operations. He will apply ad- 
vanced training and experience in Calculus, Differential Equations, 
Finite Differences, Complex Variables, and Matrix Algebra. He will 
advise and interpret, from his findings, the proper courses of action 
to be taken in the various problems under his consideration. 


Goodyear Atomic Corporation, located in southern Ohio, provides an 
excellent work atmosphere, and, in addition, the advantages of living 
in a pleasant small community with plenty of economical housing and 
many recreational facilities. 


SUBMIT RESUME SHOWING MATH COURSES 
COMPLETED AND SALARY REQUIREMENTS TO: 


GOODYEAR ATOMIC CORPORATION 
P.O. Box 628 
Portsmouth, Ohio 
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As a long term prime operating contractor for 
the AEC, we are engaged in a program entailing
the design, development and manufacture
of extremely precise and complex
electronic, electro-mechanical devices. These 
devices must meet extraordinarily high levels 
of reliability; thus Bendix is highly quality- 
conscious, a factor of prime significance in 
certain of the activities described below. 
Being an engineer-managed engineering firm, 
we believe we offer a truly professional at- 
mosphere, an environment more conducive to 
creative effort and subsequent professional 
advancement. We cordially invite you to in- 
vestigate these and many personal advan- 
tages . . . these include more relaxed, eco- 
nomical living in a highly-progressive, mod- 
erate-size city . . . modern well-rated schools 
. . . short drive to our suburban plant . . .
cultural, recreational activities . . . opportuni- 
ties for advanced study if desired. 


APPLIED MATHEMATICIANS: Can you relate advanced mathematical concepts and techniques to fundamental
engineering problems? If so, our design and systems engineers can
supply you with an interesting variety of problems related to ultimate product
reliability. Experience in any of the following areas would be helpful: electronics,
stress analysis, vibration, quality control, statistics, numerical analysis,
mathematical logic, matrices, digital computer programming.


STATISTICIAN: To undertake fundamental studies in technical and administrative problems
and to serve as consultant to other statisticians, mathematicians and engineers.
This is a senior position requiring:
1. A thorough grounding in advanced statistics and acquaintance with most
of its ramifications;
2. The ability to supervise long-range technical projects and to present the
results of these projects in a form palatable to engineers and management.
Original work in statistics will be expected and the evidence of a high degree
of creative ability is considered to be the major requirement.


MATHEMATICIAN OR STATISTICIAN: Familiar with applied mathematics or statistics to be a member of project . . .
Qualifications for this job include a well-developed imagination, report-writing
ability, and the ability to gently elicit information from uncommunicative
people. Desirable qualifications include several years of industrial experience
and experience with electronic computers.


TECHNICAL WRITER: With experience in engineering, quality control, reliability, applied mathematics
or statistics. An appreciation of the various aspects of reliability and
of their importance in the defense effort is required in this position which
will consist of the effective dissemination of reliability information in such a
manner as to encourage interest and cooperation. Reliability information will
include reliability reports, posters, publications, and, perhaps of the most
importance, training courses. The primary qualifications for success in this
position are writing ability, speaking ability and the ability to think creatively.


RELIABILITY SPECIALIST OR OPERATIONS ANALYST: Capable of organizing and directing a reliability support function within the
Reliability organization. The reliability support envisioned consists of the
numerical assessment of reliability together with the publishing of results and
research. Experience in reliability, familiarity with statistical methods and
electronic computers, and some administrative experience would be desirable.

RELIABILITY STATISTICIAN: To guide the acquisition and analysis of reliability information. Although
industrial experience and familiarity with electronic computers are desirable,
the major requirement for this position is a statistical background involving
sampling, inference from samples, and data analysis from incomplete and
partially erroneous information. Highly desirable is a knowledge of the mathematics
of reliability, including probability, communications matrices,
reliability equations and point and interval estimates of system reliability.
Since reliability is as yet in its infancy, the work required of this position is
largely creative and will require a creative imagination for its successful
accomplishment.


professional 
opportunities 
at 


BENDIX 


KANSAS CITY 


Your inquiry will be welcomed and han- 
dled promptly and in strict confidence. 
Kindly mail it at once to Mr. T. H. Tillman, 
Box 303-MW, Bendix, Kansas City, Mo. 
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THREE IBM CAREER OPPORTUNITIES 
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INTERNATIONAL BUSINESS 
MACHINES CORPORATION 


COMPUTER PROGRAMMING at IBM is being extended to
include many new areas—such as orbit computation, meteoro- 
logical satellites, space probes, information retrieval, design auto- 
mation, real-time systems, and optical studies. As a result, we are 
greatly expanding our programming staff, creating opportunities 
for people with various levels of experience. Assignments involve 
a wide variety of problems in science, business and government. 


Qualifications: Degree in Math, Statistics, the Physical Sciences, 
Engineering or Engineering Science . . . plus one year’s program- 
ming experience. 


MATHEMATICS RESEARCH at IBM involves interesting challenges
in a wide variety of areas. These include matrix algebra;
logic; mathematical physics; and probability, communication and 
information theory. Other fields which are also being subjected to 
intensive study are numerical analysis, combinatorial topology, 
and operations research. 


Qualifications: B.S., M.S., or Ph.D. in Math, Physics, Statistics, 
Engineering Science, or Electrical Engineering — and proven ability 
to assume important technical responsibilities in your sphere of
interest. 


APPLIED MATHEMATICS offers unusually fine opportunities 
for the math-oriented. You will be asked to apply your knowledge
of mathematical and statistical analysis, probability, logic, and 
coding to advanced computer development problems. Assignments 
may take you into computer systems design, component engineering,
human factors engineering, or feed-back control theory, information
and communication theory, inertial guidance, and scientific
programming. 

Qualifications: B.S. or Advanced Degree in Math, Physics, or Sta- 
tistics — plus related experience. 


There is a wide and diverse range of career opportunities at IBM. 
Advancement is rapid. The demands of a constantly expanding 
program of research and development, and promotion from 
within, based on individual merit and achievement, make this
possible. Working alone or on a small team, you'll find that spe- 
cialized assistance is readily available. 


For details, write, outlining your background and interests, to: 
Mr. R. E. Rodgers, Dept. 577L 

IBM Corporation 

590 Madison Avenue, New York 22, N. Y. 
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COMPUTING 
SERVICE 


...Made to Order
For Researchers 
and Statisticians 


Since few companies have 
enough work volume to jus- 
tify computers of their own, 
STATISTICAL maintains com- 
puting equipment to serve any 
company on a low-cost, hourly, 
as-needed basis. 


This service is built around 
the combined skills of mathe- 
maticians, statisticians, project 
engineers and programming 
specialists—ready to work for
you on your computing prob- 
lem. 


Here are a few of the appli- 
cations in which our computer 


STATISTICAL 


will excel in saving you time 
and money: 


• Simple and Multiple Correlations and Regressions
• Analysis of Variance
• Factor Analysis
• Chi Square for a Contingency Table
• Matrix Calculations
• Linear Programming
• Curve and Surface Fitting
• Solution of Differential Equations


Just contact our nearest office 
today for a free analysis and 
cost estimate of your problem. 


GENERAL OFFICES:


53 West Jackson Boulevard 
Chicago 4, Illinois 
Phone: HArrison 7-4500 


TABULATING CORPORATION 


Established 1933 


TABULATING + CALCULATING + TYPING 
TEMPORARY OFFICE PERSONNEL 


Chicago • New York • St. Louis
Newark • Cleveland

Los Angeles • Kansas City

San mi 

Philadelphia • Palo Alto
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Irwin Books in Statistics 
STATISTICS FOR BUSINESS DECISIONS


By ERNEST KURNOW, GERALD J. GLASSER, and FREDERICK R. OTTMAN, 
All of New York University 

Emphasis is not on compiling statistical information but on how to use
it once it is compiled. The meaning and understanding of how statistical
concepts may be utilized are stressed by tying every technique to practical
business problem situations.


QUALITY CONTROL AND INDUSTRIAL STATISTICS 


Revised Edition 
By ACHESON J. DUNCAN, The Johns Hopkins University 


Emphasis is on the presentation of the basic principles and procedures of
statistical quality control, including treatment of the assumptions, principles,
and theories that underlie modern quality control practice. This
edition has been updated to incorporate all the recent developments in
the field.


BUSINESS AND ECONOMIC STATISTICS


By WILLIAM A. SPURR, Stanford University, LESTER S. KELLOGG, Deere and
Company, and JOHN H. SMITH, The American University 


A Revised Edition of this popular text is now in preparation. 
—Write For Examination Copies— 
RICHARD D. IRWIN, INC. • HOMEWOOD, ILLINOIS


Mathematical Statisticians


Exceptional opportunities exist at the
Naval Weapons Laboratory for mathematical
statisticians with MS and PhD
degrees and an interest in operations research.
The principal efforts of the Operations
Research Group at present are
devoted to the formulation and execution
of extensive programs in the areas of
target analysis, weapons system analysis,
and missile feasibility and evaluation.
Senior Statisticians on the staff also
serve as consultants in areas of statistical
inference, probability, and experimental
design. The most advanced computing
equipment and capable junior scientists
are available for assistance. Starting
salaries range from $7,510 to $11,595 per
annum. The Naval Weapons Laboratory
provides an excellent work atmosphere
and, in addition, the advantages of living
in a pleasant small community with economical
housing and many recreational
facilities.


For further information, write to the Director,
Computation and Analysis Laboratory.


NWL


U. S. Naval Weapons Laboratory


Department of the Navy • Dahlgren, Virginia


Please mention the Journal of the American Statistical Association in writing advertisers




Published November, 1959 
INTRODUCTION TO MATHEMATICAL STATISTICS 


ROBERT V. HOGG, Associate Professor of Mathematics 
ALLEN T. CRAIG, Professor of Mathematics, 
both, University of Iowa 


The first publication in a series 
of mathematics texts under the general editorship of 
CARL B. ALLENDOERFER. 


This work for advanced students in mathematics, tested in preliminary 
form in classes at the University of Iowa, introduces in a reasonable and 
natural order fundamental concepts in probability and statistics and 
shows, whenever possible, interrelations among these concepts. Topics
treated include random sampling distribution theory, point and in- 
terval estimation, limiting distributions, distribution free problems and 
tests of statistical hypotheses. Considerable emphasis is placed on the im- 
portance of the concept of a sufficient statistic. Especially noteworthy is 
the inclusion of modern theoretical developments not found in current 
books at this mathematical level. 1959, 245 pages, $6.75 


INTRODUCTION TO PROBABILITY AND STATISTICS 

B. W. LINDGREN, Assistant Professor of Mathematics 

G. W. McELRATH, Professor and Head of the Industrial 
Engineering Division, both, University of Minnesota 

This introductory textbook presents classical and modern statistical meth- 
ods based on a preliminary, thorough treatment of the concept of proba- 
bility. Although some background in calculus is presupposed, the work 
is not based on a pure mathematical approach to statistics. Professor Paul 
Randolph, Industrial Engineering Division, Purdue University writes: 
“Chapters 1-5 make up one of the neatest packages on probability basic 
to statistics that I have ever seen. . . . The textual development is orderly, 
the examples good, and the problems superb!” 1959, 277 pages, $6.25 


ELEMENTARY MATRIX ALGEBRA 

FRANZ E. HOHN, Associate Professor of Mathematics, 
University of Illinois 

“. . . in a rather long mathematical career I have seen only a very few
mathematical books of such excellence in every respect. . . . I can only 
wish there were more mathematical books like this one. . . .”—John P.
Scholz, Professor of Mathematics, New Mexico Institute of Mining and 
Technology 1958, 305 pages, $7.50 (Text edition) 


The Macmillan Company 


60 FIFTH AVENUE, NEW YORK 11, N.Y. 
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THE COMPLETE INTRODUCTION TO 


Principles of STATISTICAL ANALYSIS 


Samuel B. Richmond, Columbia University 


A comprehensive, eminently teachable
textbook designed as an introduction to
statistical analysis for students in business
and economics. Detailed illustrative
material combines with the text to present
a thorough treatment of the collection,
analysis, and presentation of statistical
data. Organized around the modern
concept of statistical induction, book
keeps mathematical procedures to a minimum
by introducing and explaining the
techniques employed at the point of use.
A Glossary of Equations lists each basic
equation used in the text. “Well organized
. . . excellent.”—Advanced Management.
210 ills., tables; 491 pp. $6.50


For clearer visualization of facts and figures .. . 


Handbook of GRAPHIC PRESENTATION 


Calvin F. Schmid, University of Washington 


A working guide for all concerned with
producing meaningful, interest-rousing
charts and graphs. Handbook shows how
complicated data can be put into easily
intelligible form; fully analyzes each
basic type of chart, indicating its advantages
and disadvantages in presenting information
of different kinds; gives helpful
pointers on avoiding difficulties in
construction. Includes the first detailed
discussion of 3-dimensionals, plus scores
of examples from a wide variety of fields.
“A highly creditable job of producing a
reliable and helpful handbook.”—Journal
of the American Statistical Association.
210 ills., tables; 316 pp. $6.50


THE RONALD PRESS COMPANY • 15 E. 26th St., New York 10


Statistician 


A national consumer products company located in the midwest has 
an unusual opportunity for a statistician with administrative abilities 
to work with our research laboratory personnel. This position requires
a good background in experimental design and attribute testing and
setting up various test methods. Electronic computer available for data 
processing, but computer experience not necessary. Degree required. 
Advanced degree preferred. Apply to J. P. Hamilton, 


THE TONI COMPANY
456 Merchandise Mart 
Chicago 54, Illinois 
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STATISTICAL ANALYSIS IN 
PSYCHOLOGY AND EDUCATION 


By George A. Ferguson, McGill University. McGraw-Hill Series in Psychology. 354 pages,
$7.00


A textbook introducing the ideas and applications of statistics for students of the be- 
havioral sciences—education, psychology, psychiatry, and sociology. Emphasis is placed 
on the analysis and interpretation of data resulting from the conduct of experiments with 
human and animal subjects. The book provides the technology necessary for the statistical 
treatment of most sets of experimental data encountered in practice. 


QUANTITATIVE METHODS IN 
PSYCHOLOGY 


By Don Lewis, State University of Iowa. McGraw-Hill Series in Psychology. Ready in January.


A text for the increasingly important “Quantitative Methods” or “Quantitative Analysis” 
psychology courses. It is concerned with the application of non-statistical mathematical 
and graphical techniques to the clarification and interpretation of experimental data. The 
aim of the book is to provide a comprehensive survey of the quantitative and statistical 
procedures basic to the utilization of mathematical functions in describing empirical re- 
lationships and in developing theoretical schema for behavior. 


INTRODUCTION TO STATISTICAL 
REASONING 


By Philip J. McCarthy, Cornell University. 392 pages, $6.00 


Careful introduction to statistical reasoning for the student with a nonmathematical back- 
ground. The basic elements of statistical reasoning are presented in as rigorous a fashion 
as possible, assuming little prior mathematical training. Illustrative materials have been 
drawn primarily from investigations regarded in the social sciences as being important 
and fundamental. 


PROBABILITY AND STATISTICS FOR 

BUSINESS DECISIONS: An Introduction to 

Managerial Economics Under Uncertainty 
By Robert Schlaifer, Harvard University. 732 pages, $11.50 


This brilliant textbook is the first successful attempt to tie together the statistical tech- 
niques and the economics of business decisions. A Students’ Manual and Instructors’ 
Manual are available. 


Send for copies on approval 


McGRAW-HILL BOOK COMPANY, INC. 
330 WEST 42nd STREET, NEW YORK 36, N.Y.
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a new text encourages the beginner in the uses and inter- 
pretation of statistics, stressing the importance of a critical 
evaluative attitude 


ELEMENTARY STATISTICAL METHODS 
IN PSYCHOLOGY AND EDUCATION 


Paul Blommers and E. F. Lindquist 


* * * exploration in depth of a limited number of basic statistical
concepts and techniques —— emphasis on the logico-mathematical
basis of statistics for the student with limited
mathematical background —— a Study Manual to provide
(1) questions designed to lead to rediscovery of many of the
important properties considered in the text, (2) a second presentation
in a different context —— in a more practical setting
—— of the most important aspects in the text —— Instructor’s
Manual and Key for the Study Manual


January 1960 $5.75 


528 pages 


HOUGHTON MIFFLIN COMPANY * Boston 
New York 16 Atlanta 5 Geneva Dallas 1 Palo Alto 


STATISTICIAN 


Ph.D.'s in Applied or Mathematical Statistics with an interest in 
operations research. Immediate openings at the intermediate and 
senior levels with the Washington Office of an expanding scientific
firm, with a challenging research program involving experimental
design, mathematical models, and program evaluation. Experience
required. Apply to G. Ronald Herd, Director.


BOOZ - ALLEN APPLIED RESEARCH, INC. 
4921 Auburn Avenue 
Bethesda 14, Maryland 
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REVIEW OF ECONOMICS AND STATISTICS 
Published by the Department of Economics, Harvard University 
Editor: SEYMOUR E. HARRIS* 


Associate Editors: Abram Bergson, Edward H. Chamberlin, Robert Dorfman, James S. Duesenberry,
John T. Dunlop, John Kenneth Galbraith,* Alexander Gerschenkron, Gottfried Haberler,*
Alvin H. Hansen,* Carl Kaysen, Wassily Leontief,* John Lintner,* Edward S. Mason, Sumner
H. Slichter, Arthur Smithies, John H. Williams.
* Members of 3 Board 


Volume XLI November 1959 Number 4 


Recent Productivity Trends in the Federal Government: An Exploratory Study
Interindustry Relations of a Metropolitan Area ........ Werner Z. Hirsch
The Public Debt Reconsidered: A Review Article ........ Alvin H. Hansen
Changes in the Share of Wealth Held by Top Wealth-Holders, 1922-1956
Welfare, Income, and Budget Needs ........ Martin David
An Analysis of the Nature of Aggregates at Constant Prices
A Bias in the Seasonally Adjusted Unemployment Series and a Suggested Alternative
The Measurement of Employment - and Prices in the Steel Industry
Analysis of Used Car Purchases ........ Mordechai E. Kreinin
NOTES AND BOOK REVIEWS (See inside front cover)


The Review of Economics and Statistics is published quarterly.
Annual subscription price: $7.00. 
Order from the Harvard University Press, 
Cambridge 38, Mass. 


JOURNAL OF FARM ECONOMICS 
Published by 
THE AMERICAN FARM ECONOMIC ASSOCIATION 


Editor: Herman M. Southworth
The Pennsylvania State University, University Park, Pennsylvania 


Volume XLI November, 1959 Number 4 
Some Further Reflections on Supply Control ............. Willard W. Cochrane 
Programming Regional Adjustments in Grain Production to Eliminate Surpluses 

Guides for Speculation About the Vertical Integration of Agriculture With Allied 

Resource Fixity and Farm Organization ............ Clark Edwards
Emerging Phenomenon: A Cycle in Hogs ............ Harold F. Breimyer
Development of Revolving Finance in Sunkist Growers ............

Grace Larsen and H. E. Erdman 


Also notes, book reviews, and announcement of new bulletins and other publica- 
tions in agricultural economics. 


Published in February, May, August, November, and December. Yearly subscrip- 
tion $9. 
Secretary-Treasurer: C. Del Mar Kearl 
Department of Agricultural Economics 
Cornell University, Ithaca, New York 
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TECHNOMETRICS 


A Journal of Statistics for the Physical, 
Chemical and Engineering Sciences 


Vol. 1, No. 2 CONTENTS May 1959 
Measurements Made by Matching with Known Standards 

W. J. Youden, W. S. Connor and N. C. Severo 
Random Balance Experimentation ................. F. E. Satterthwaite 
The Application of Random Balance Designs Thomas A. Budne 


Discussion of the Papers of Messrs. Satterthwaite and Budne 
..W. J. Youden, O. Kempthorne, J. W. Tukey, G. E. P. Box, J. S. Hunter 


Quick Analysis Methods for Random Balance Screening Experiments. . 


Vol. 1, No. 3 CONTENTS August 1959 


Simplified Estimators for the Normal Distribution when Samples are 
Singly Censored or Truncated ................ A. Clifford Cohen, Jr.


Control Chart Tests Based on Geometric Moving Averages


The Measuring Process John Mandel 
Factorial Experiments in Life Testing Marvin Zelen 


The Use of LaGrange Multipliers with Response Surfaces 
A. W. Umland and W. N. Smith 


A Statistical Model for Evaluating the Reliability of Safety Systems for 
Plant Manufacturing Hazardous Products Louis B. Kahn 


Subscription rates: $6.00 per year for members of The American Statis- 
tical Association and the American Society for Quality Control. $8.00 
for non-members. Make remittance payable to Technometrics and send 
to: American Statistical Association, 1757 K St., N.W., Washington 6, 
D.C. 
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POPULATION STUDIES 


A Journal of Demography 
Edited by D. V. Glass and E. Grebenik


Vol. XIII, No. 1 CONTENTS July 1959 


W. D. BORRIE. The Growth of the Australian Population with Particular Reference
to the Period Since 1947.
R. K. KELSALL and SHEILA MITCHELL. Married Women and Employment in
England and Wales.
R. T. SMITH. Some Social Characteristics of Indian Immigrants to British
Guiana.
GALINA V. SELEGEN. Changing Features in the Soviet Population Census Programme.
J. G. C. BLACKER. Fertility Trends of the Asian Population of Tanganyika.
D. F. ROBERTS and R. E. S. TANNER. A Demographic Study in an Area of Low
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@ A measure of correlation for a 2 x 2 table and the exact test 
of its significance are introduced and explained in a way simpler 
than usually found 


@ Use of orthogonal polynomials for simpler computation of 
polynomial trends is explained and illustrated; a short table 
of orthogonal polynomials is provided in an appendix 


© Extensive mathematical tables are included in the appendix. 
To be published in January App. 704 pp. Text list $7.50
To receive approval copies, write: Box 903 
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